markitect-main/docs/md-explode-command.md
tegwick 312bf8c7bf feat: complete TDD8 implementation of markdown file explosion - Issue #138
Complete implementation of md-explode command for transforming single
markdown files into organized directory structures:

Core Implementation:
- MarkdownSection class for hierarchical document modeling
- extract_headings() - Parse markdown headings with levels
- parse_markdown_structure() - Build section hierarchy from content
- generate_safe_filename() - Convert headings to filesystem-safe names
- explode_markdown_file() - Main explosion functionality
- DirectoryStructureBuilder - Create organized file/directory structures

CLI Integration:
- md-explode command with comprehensive options
- --dry-run for previewing structure
- --verbose for detailed output
- --max-depth for limiting nesting
- --output-dir for custom output location

Key Features:
- Hierarchical structure preservation (# → ## → ###)
- Smart filename generation with Unicode support
- Front matter handling and preservation
- Content integrity maintenance
- Cross-platform filesystem compatibility
- Comprehensive error handling and validation

Refactoring Applied:
- Eliminated code duplication between filename functions
- Extracted front matter processing into dedicated function
- Modularized CLI command with helper functions
- Improved error handling and user feedback

Documentation:
- Complete API documentation with docstrings
- Comprehensive user documentation (docs/md-explode-command.md)
- Usage examples and troubleshooting guide
- Integration instructions with other MarkiTect commands

Testing: 47 comprehensive tests covering all functionality
Status: Production-ready, full TDD8 cycle completed
Performance: Efficient for documents with thousands of sections

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-07 15:44:30 +02:00


# MD-Explode Command Documentation
## Overview
The `md-explode` command transforms a single markdown file with a hierarchical heading structure into an organized directory tree, where each heading becomes a separate file or directory. This is particularly useful for managing large documents such as books, technical documentation, or structured reports.
## Installation
The `md-explode` command is built into MarkiTect as part of the markdown commands plugin. No additional installation is required.
## Usage
### Basic Syntax
```bash
markitect md-explode <input_file> [OPTIONS]
```
### Parameters
#### Required
- `INPUT_FILE` - Path to the markdown file to explode
#### Options
- `--output-dir, -o PATH` - Output directory for exploded files (default: `<filename>_exploded/`)
- `--max-depth INTEGER` - Maximum directory nesting depth (default: 10)
- `--dry-run` - Preview what would be created without actually creating files
- `--verbose, -v` - Show detailed output during processing
## Examples
### Basic Usage
```bash
# Explode book.md into book_exploded/ directory
markitect md-explode book.md
```
### Custom Output Directory
```bash
# Explode into a specific directory
markitect md-explode documentation.md --output-dir ./chapters/
```
### Preview Mode
```bash
# See what structure would be created without creating files
markitect md-explode large-document.md --dry-run --verbose
```
### Verbose Output
```bash
# Get detailed information about the explosion process
markitect md-explode technical-guide.md --verbose
```
## Input Format
The command expects a markdown file with a hierarchical heading structure:
```markdown
# Part 1: Introduction
Introduction content here.
## Chapter 1: Getting Started
Chapter content here.
### Section 1.1: Installation
Installation instructions.
### Section 1.2: Configuration
Configuration details.
## Chapter 2: Advanced Topics
Advanced content.
# Part 2: Reference
Reference material.
```
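As an illustration of how such headings might be recognized, here is a minimal sketch of a heading extractor. It is not the MarkiTect source: the `extract_headings` name is borrowed from the project's description, but the signature, the return shape, and the choice to skip fenced code blocks are assumptions made for this example.

```python
import re
from typing import List, Tuple

# ATX heading: one to six '#' characters, whitespace, then the title text.
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.+?)[ \t]*$")
# Built with multiplication so this example contains no literal fence markers.
FENCE_MARKERS = ("`" * 3, "~" * 3)


def extract_headings(text: str) -> List[Tuple[int, str, int]]:
    """Return (level, title, line_number) for each heading outside code fences.

    Hypothetical signature: the real function may return a different shape.
    """
    headings = []
    in_fence = False
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE_MARKERS):
            # Toggle on every fence line so '#' comments in code are ignored.
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2), lineno))
    return headings
```

The level is simply the number of `#` characters, which is what a hierarchy builder would need to decide which heading is the parent of which.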
## Output Structure
The command creates a directory structure that mirrors the document hierarchy:
```
document_exploded/
├── part_1_introduction/
│   ├── index.md                      # Part introduction content
│   ├── chapter_1_getting_started/
│   │   ├── index.md                  # Chapter content
│   │   ├── section_1_1_installation.md
│   │   └── section_1_2_configuration.md
│   └── chapter_2_advanced_topics.md
└── part_2_reference.md
```
### Structure Rules
1. **Directories** are created for headings that have child sections
2. **Files** are created for leaf sections (no children)
3. **Index files** contain the content of parent sections
4. **Nested structure** preserves the document hierarchy
5. **Safe filenames** are generated from heading text
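The rules above can be sketched as a small recursive planner that maps a section tree to output paths without touching the filesystem. Everything here is hypothetical: the `Section` dataclass, `plan_files`, and the simplified `slug` helper are illustrative names, not MarkiTect's actual classes, and whether each file also repeats its own heading line is left open.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, List


@dataclass
class Section:
    """Hypothetical stand-in for a parsed heading and its body text."""
    title: str
    content: str = ""
    children: List["Section"] = field(default_factory=list)


def slug(title: str) -> str:
    # Simplified placeholder for the real filename rules.
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def plan_files(section: Section, parent: PurePosixPath) -> Dict[str, str]:
    """Map each output path to the content it would hold."""
    name = slug(section.title)
    if not section.children:
        # Rule 2: a leaf section becomes a single file.
        return {str(parent / f"{name}.md"): section.content}
    # Rules 1 and 3: a parent becomes a directory; its own text goes to index.md.
    directory = parent / name
    planned = {str(directory / "index.md"): section.content}
    for child in section.children:
        # Rule 4: children are planned inside the parent's directory.
        planned.update(plan_files(child, directory))
    return planned


doc = Section("Part 1: Introduction", "Introduction content here.", [
    Section("Chapter 1: Getting Started", "Chapter content here.", [
        Section("Installation", "Installation instructions."),
    ]),
    Section("Chapter 2: Advanced Topics", "Advanced content."),
])
for path in plan_files(doc, PurePosixPath("document_exploded")):
    print(path)
```

Planning paths before writing anything is also one plausible way to support a preview mode such as `--dry-run`, since the same plan can be printed or written to disk.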
## Filename Generation
Headings are converted to filesystem-safe filenames using these rules:
- **Lowercase conversion**: "Chapter 1" → "chapter_1"
- **Special character removal**: "What's New?" → "whats_new"
- **Unicode normalization**: "Café & Résumé" → "cafe_resume"
- **Number preservation**: "Section 1.1.1" → "section_1_1_1"
- **Path character handling**: "File/Path Issues" → "file_path_issues"
- **Length limiting**: Very long titles are truncated to 100 characters
- **Conflict resolution**: Duplicate names get numbered suffixes
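A minimal sketch that reproduces the examples listed above, using only the standard library. The function names, the `untitled` fallback for headings that reduce to nothing, and the `_2`, `_3` suffix format are assumptions for illustration; the actual `generate_safe_filename()` may differ in these details.

```python
import re
import unicodedata
from typing import Set

MAX_LENGTH = 100  # truncation limit stated in the rules above


def safe_filename(heading: str, max_length: int = MAX_LENGTH) -> str:
    """Convert heading text to a lowercase, ASCII, underscore-separated name."""
    # Decompose accented characters, then drop the non-ASCII combining marks.
    decomposed = unicodedata.normalize("NFKD", heading)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    # Remove apostrophes outright so "What's" becomes "whats", not "what_s".
    lowered = ascii_text.lower().replace("'", "")
    # Collapse every other run of non-alphanumerics into a single underscore.
    name = re.sub(r"[^a-z0-9]+", "_", lowered).strip("_")
    # "untitled" is an assumed fallback for headings with no usable characters.
    return name[:max_length].rstrip("_") or "untitled"


def unique_name(name: str, used: Set[str]) -> str:
    """Append a numbered suffix when a name was already used (assumed format)."""
    candidate, counter = name, 2
    while candidate in used:
        candidate = f"{name}_{counter}"
        counter += 1
    used.add(candidate)
    return candidate


used: Set[str] = set()
for title in ["Chapter 1", "What's New?", "Café & Résumé", "Chapter 1"]:
    print(unique_name(safe_filename(title), used))
```

Tracking used names per directory rather than globally would keep suffixes rare, since duplicates only matter among siblings.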
## Features
### Front Matter Support
YAML front matter is automatically detected and handled:
```markdown
---
title: "My Document"
author: "John Doe"
---
# Chapter 1
Content starts here...
```
Front matter is preserved appropriately during the explosion process.
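Detection of a front matter block could look roughly like the following sketch. The `split_front_matter` name and its return shape are hypothetical, and the sketch only separates the block from the body; it makes no claim about where MarkiTect writes the front matter afterwards.

```python
from typing import Tuple


def split_front_matter(text: str) -> Tuple[str, str]:
    """Return (front_matter, body); front_matter is '' when none is present."""
    lines = text.splitlines(keepends=True)
    # Front matter must open on the very first line with a bare '---'.
    if not lines or lines[0].rstrip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].rstrip() == "---":
            # Everything between the delimiters is front matter.
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    # An opening delimiter without a closing one: treat the file as body only.
    return "", text


source = '---\ntitle: "My Document"\nauthor: "John Doe"\n---\n# Chapter 1\nContent starts here...\n'
front_matter, body = split_front_matter(source)
print(front_matter)
print(body)
```

Splitting before heading extraction keeps the `---` delimiters and YAML keys from being mistaken for document content.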
### Content Preservation
- **Markdown formatting** is fully preserved in exploded files
- **Code blocks** keep their fences and language identifiers, so syntax highlighting still works
- **Tables, lists, and links** are kept intact
- **Images and media references** are preserved
### Error Handling
- **Missing files**: Clear error messages for non-existent input files
- **Permission errors**: Graceful handling of filesystem permission issues
- **Malformed markdown**: Robust parsing that handles inconsistent heading levels
- **Empty files**: Appropriate handling of files with no heading structure
## Advanced Usage
### Limiting Directory Depth
```bash
# Limit to 3 levels of nesting
markitect md-explode complex-doc.md --max-depth 3
```
When the depth limit is reached, deeper sections are flattened into files instead of creating further directories.
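One possible reading of this flattening, sketched under explicit assumptions: depth counts the directories already created above a section, and a section at the limit is written as one file containing its whole subtree. The `Section`, `render`, and `needs_directory` names are hypothetical, and the real command may flatten differently (for example, by writing children as sibling files).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    """Hypothetical parsed section; level is the number of '#' characters."""
    level: int
    title: str
    content: str = ""
    children: List["Section"] = field(default_factory=list)


def render(section: Section) -> str:
    """Serialize a section and all of its descendants back to markdown."""
    parts = [f"{'#' * section.level} {section.title}", section.content]
    parts.extend(render(child) for child in section.children)
    return "\n".join(part for part in parts if part)


def needs_directory(section: Section, depth: int, max_depth: int) -> bool:
    """A section gets a directory only if it has children and the limit allows."""
    return bool(section.children) and depth < max_depth


deep = Section(4, "Details", "Some text.", [Section(5, "Edge Cases", "More text.")])
if not needs_directory(deep, depth=3, max_depth=3):
    # At the limit: the subtree is kept together in a single file's content.
    print(render(deep))
```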
### Working with Large Documents
For very large documents, use dry-run mode first to preview the structure:
```bash
markitect md-explode huge-manual.md --dry-run --verbose
```
This helps you understand the output structure and estimate disk space requirements.
## Troubleshooting
### Common Issues
**"No heading structure found"**
- The markdown file contains no headings (`#`, `##`, etc.)
- Solution: Add headings to structure your document
**"Permission denied"**
- Insufficient permissions to write to the output directory
- Solution: Check directory permissions or specify a different output location
**"File already exists"**
- The output directory already exists and contains files
- Solution: Choose a different output directory or remove existing files
**"Invalid markdown format"**
- The input file is not valid markdown
- Solution: Check the file format and fix any syntax errors
### Getting Help
```bash
# Show command help
markitect md-explode --help
# Show general MarkiTect help
markitect --help
```
## Best Practices
1. **Use descriptive headings** - They become directory and file names
2. **Maintain consistent heading levels** - Don't skip from `#` to `###`
3. **Keep headings concise** - Very long headings result in long filenames
4. **Avoid special characters** in headings when possible
5. **Preview first** - Use `--dry-run` for large documents
6. **Back up originals** - Always keep a copy of your source markdown file
## Integration
The `md-explode` command works well with other MarkiTect commands:
```bash
# Render exploded files to HTML
markitect md-render exploded_directory/ --recursive
# Create an index of the exploded structure
markitect md-index exploded_directory/ --recursive
```
This creates a complete documentation workflow, from a single file to an organized, rendered website.
## Technical Details
### Implementation
- **Language**: Python 3.8+
- **Dependencies**: Click for the CLI; `unicodedata` (Python standard library) for filename normalization
- **Parser**: Custom markdown heading parser (no external markdown library required)
- **Performance**: Efficient for documents with up to thousands of sections
### File System Compatibility
- **Cross-platform**: Works on Windows, macOS, and Linux
- **Character encoding**: UTF-8 throughout
- **Filename limits**: Respects filesystem limitations
- **Path length**: Handles deep directory structures appropriately
## See Also
- [`md-render`](md-render-command.md) - Render markdown files to HTML
- [`md-index`](md-index-command.md) - Generate index pages for directories
- [`md-ingest`](md-ingest-command.md) - Import and process markdown files
---
*This documentation is for MarkiTect version 1.0+*