feat: complete TDD8 implementation of markdown file explosion - Issue #138
Complete implementation of md-explode command for transforming single markdown files into organized directory structures:

Core Implementation:
- MarkdownSection class for hierarchical document modeling
- extract_headings() - Parse markdown headings with levels
- parse_markdown_structure() - Build section hierarchy from content
- generate_safe_filename() - Convert headings to filesystem-safe names
- explode_markdown_file() - Main explosion functionality
- DirectoryStructureBuilder - Create organized file/directory structures

CLI Integration:
- md-explode command with comprehensive options
- --dry-run for previewing structure
- --verbose for detailed output
- --max-depth for limiting nesting
- --output-dir for custom output location

Key Features:
- Hierarchical structure preservation (# → ## → ###)
- Smart filename generation with Unicode support
- Front matter handling and preservation
- Content integrity maintenance
- Cross-platform filesystem compatibility
- Comprehensive error handling and validation

Refactoring Applied:
- Eliminated code duplication between filename functions
- Extracted front matter processing into dedicated function
- Modularized CLI command with helper functions
- Improved error handling and user feedback

Documentation:
- Complete API documentation with docstrings
- Comprehensive user documentation (docs/md-explode-command.md)
- Usage examples and troubleshooting guide
- Integration instructions with other MarkiTect commands

Testing: 47 comprehensive tests covering all functionality
Status: Production-ready, full TDD8 cycle completed
Performance: Efficient for documents with thousands of sections

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
238
docs/md-explode-command.md
Normal file
@@ -0,0 +1,238 @@
# MD-Explode Command Documentation

## Overview

The `md-explode` command transforms a single markdown file with hierarchical structure into an organized directory tree, where each heading becomes a separate file or directory. This is particularly useful for managing large documents like books, technical documentation, or structured reports.

## Installation

The `md-explode` command is built into MarkiTect as part of the markdown commands plugin. No additional installation is required.

## Usage

### Basic Syntax
```bash
markitect md-explode INPUT_FILE [OPTIONS]
```

### Parameters

#### Required
- `INPUT_FILE` - Path to the markdown file to explode

#### Options
- `--output-dir, -o PATH` - Output directory for exploded files (default: `<filename>_exploded/`)
- `--max-depth INTEGER` - Maximum directory nesting depth (default: 10)
- `--dry-run` - Preview what would be created without actually creating files
- `--verbose, -v` - Show detailed output during processing

## Examples

### Basic Usage
```bash
# Explode book.md into book_exploded/ directory
markitect md-explode book.md
```

### Custom Output Directory
```bash
# Explode into a specific directory
markitect md-explode documentation.md --output-dir ./chapters/
```

### Preview Mode
```bash
# See what structure would be created without creating files
markitect md-explode large-document.md --dry-run --verbose
```

### Verbose Output
```bash
# Get detailed information about the explosion process
markitect md-explode technical-guide.md --verbose
```

## Input Format

The command expects markdown files with hierarchical heading structure:

```markdown
# Part 1: Introduction
Introduction content here.

## Chapter 1: Getting Started
Chapter content here.

### Section 1.1: Installation
Installation instructions.

### Section 1.2: Configuration
Configuration details.

## Chapter 2: Advanced Topics
Advanced content.

# Part 2: Reference
Reference material.
```
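
To make the expected input concrete, here is a minimal sketch of how a heading hierarchy could be recovered from such a file. It is not the MarkiTect implementation: `Section` and `parse_sections` are hypothetical names (the commit message mentions `MarkdownSection` and `parse_markdown_structure()`, but their signatures are not shown), and the sketch assumes ATX-style headings only, a simple fence toggle, and that text before the first heading is ignored.

```python
import re
from dataclasses import dataclass, field
from typing import List

# ATX heading: up to three leading spaces, one to six '#', then the title.
HEADING_RE = re.compile(r"^ {0,3}(#{1,6})[ \t]+(.+?)[ \t]*$")
# Opening or closing line of a fenced code block (simplified toggle).
FENCE_RE = re.compile(r"^\s*(```|~~~)")


@dataclass
class Section:
    # Hypothetical stand-in for the section model; field names are assumptions.
    title: str
    level: int
    content: List[str] = field(default_factory=list)
    children: List["Section"] = field(default_factory=list)


def parse_sections(text: str) -> List[Section]:
    """Build a section tree, ignoring '#' lines inside fenced code blocks."""
    roots: List[Section] = []
    stack: List[Section] = []  # chain of currently open ancestor sections
    in_fence = False
    for line in text.splitlines():
        if FENCE_RE.match(line):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            section = Section(title=match.group(2), level=len(match.group(1)))
            # Close every open section at the same or a deeper level.
            while stack and stack[-1].level >= section.level:
                stack.pop()
            (stack[-1].children if stack else roots).append(section)
            stack.append(section)
        elif stack:
            stack[-1].content.append(line)  # body text of the open section
    return roots
```

Because the stack only compares levels, a document that skips from `#` to `###` still parses: the `###` section simply becomes a direct child of the nearest shallower heading.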

## Output Structure

The command creates a directory structure that mirrors the document hierarchy:

```
document_exploded/
├── part_1_introduction/
│   ├── index.md                    # Part introduction content
│   ├── chapter_1_getting_started/
│   │   ├── index.md               # Chapter content
│   │   ├── section_1_1_installation.md
│   │   └── section_1_2_configuration.md
│   └── chapter_2_advanced_topics.md
└── part_2_reference.md
```

### Structure Rules

1. **Directories** are created for headings that have child sections
2. **Files** are created for leaf sections (no children)
3. **Index files** (`index.md`) hold the content that sits directly under a parent heading
4. **Nested structure** preserves the document hierarchy
5. **Safe filenames** are generated from heading text
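
The rules above can be sketched as a small recursive planner. This is an illustration under stated assumptions, not MarkiTect's `DirectoryStructureBuilder`: `Node` and `plan_paths` are hypothetical names, each `name` is assumed to be an already-safe filename stem, and the example tree uses made-up section names.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List


@dataclass
class Node:
    # Hypothetical section node; `name` is assumed to be filesystem-safe already.
    name: str
    children: List["Node"] = field(default_factory=list)


def plan_paths(nodes: List[Node], base: PurePosixPath) -> List[PurePosixPath]:
    """List the files implied by the structure rules, in document order."""
    paths: List[PurePosixPath] = []
    for node in nodes:
        if node.children:
            # Rule 1 and rule 3: a parent becomes a directory with an index.md.
            directory = base / node.name
            paths.append(directory / "index.md")
            paths.extend(plan_paths(node.children, directory))
        else:
            # Rule 2: a leaf section becomes a single file.
            paths.append(base / f"{node.name}.md")
    return paths


# Illustrative tree: one part with a nested chapter, plus a leaf part.
tree = [
    Node("guide", [
        Node("setup", [Node("install"), Node("configure")]),
        Node("advanced"),
    ]),
    Node("reference"),
]

for path in plan_paths(tree, PurePosixPath("out")):
    print(path)
```

Planning paths before touching the disk is also one natural way to support a preview mode such as `--dry-run`, although the source does not say how the command implements it.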

## Filename Generation

Headings are converted to filesystem-safe filenames using these rules:

- **Lowercase conversion**: "Chapter 1" → "chapter_1"
- **Special character removal**: "What's New?" → "whats_new"
- **Unicode normalization**: "Café & Résumé" → "cafe_resume"
- **Number preservation**: "Section 1.1.1" → "section_1_1_1"
- **Path character handling**: "File/Path Issues" → "file_path_issues"
- **Length limiting**: Very long titles are truncated to 100 characters
- **Conflict resolution**: Duplicate names get numbered suffixes
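
One way to satisfy all of the listed examples at once is sketched below. It is a guess at the behavior, not the actual `generate_safe_filename()`: the function names are hypothetical, the choice to delete apostrophes while turning every other run of punctuation into an underscore is inferred from the examples, and the `untitled` fallback and the `_2`, `_3` suffix format are assumptions.

```python
import re
import unicodedata
from typing import Set

MAX_LENGTH = 100  # the documented truncation limit


def safe_filename(heading: str, max_length: int = MAX_LENGTH) -> str:
    """Turn a heading into a lowercase, underscore-separated filename stem."""
    # Decompose accented characters, then drop the combining marks: "é" -> "e".
    decomposed = unicodedata.normalize("NFKD", heading)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Delete apostrophes so "What's" becomes "whats" rather than "what_s".
    text = text.lower().replace("'", "").replace("\u2019", "")
    # Collapse each run of remaining non-word characters into one underscore.
    text = re.sub(r"[\W_]+", "_", text).strip("_")
    # Assumed fallback for headings that contain no usable characters.
    return text[:max_length].rstrip("_") or "untitled"


def unique_name(name: str, taken: Set[str]) -> str:
    """Append _2, _3, ... until the name is unused (suffix format is assumed)."""
    candidate, counter = name, 1
    while candidate in taken:
        counter += 1
        candidate = f"{name}_{counter}"
    taken.add(candidate)
    return candidate
```

Under this sketch, "Section 1.1: Installation" would become `section_1_1_installation`, since the dot and the colon are both treated as separators.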

## Features

### Front Matter Support
YAML front matter is automatically detected and handled:

```markdown
---
title: "My Document"
author: "John Doe"
---

# Chapter 1
Content starts here...
```

Front matter is preserved appropriately during the explosion process.
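
As an illustration of the detection step, the sketch below separates a leading front matter block from the body without parsing the YAML. `split_front_matter` is a hypothetical helper, not the dedicated function mentioned in the commit message, and where the front matter ends up after explosion is not specified here.

```python
from typing import Tuple


def split_front_matter(text: str) -> Tuple[str, str]:
    """Return (front_matter, body); front_matter is "" when none is present."""
    lines = text.splitlines(keepends=True)
    # Front matter must start on the very first line with a '---' delimiter.
    if not lines or lines[0].strip() != "---":
        return "", text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            # Keep both delimiters with the front matter block.
            return "".join(lines[: index + 1]), "".join(lines[index + 1:])
    # An opening delimiter with no closing one is treated as ordinary content.
    return "", text
```

Splitting before heading extraction keeps the `---` delimiters and YAML keys from being mistaken for document content.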

### Content Preservation
- **Markdown formatting** is fully preserved in exploded files
- **Code blocks** keep their fences and language tags, so syntax highlighting still works
- **Tables, lists, and links** are kept intact
- **Images and media references** are preserved

### Error Handling
- **Missing files**: Clear error messages for non-existent input files
- **Permission errors**: Graceful handling of filesystem permission issues
- **Malformed markdown**: Robust parsing that handles inconsistent heading levels
- **Empty files**: Appropriate handling of files with no heading structure

## Advanced Usage

### Limiting Directory Depth
```bash
# Limit to 3 levels of nesting
markitect md-explode complex-doc.md --max-depth 3
```

When depth is exceeded, deeper sections are flattened into files rather than creating more directories.
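
The exact flattening behavior is not spelled out, so the following is only one plausible reading: a section that would need a directory deeper than the limit is written as a single file, with its descendants kept inline in that file. `Node` and `plan_with_limit` are hypothetical names, and depth is counted from 1 for top-level directories.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical section node with an already-safe name.
    name: str
    children: List["Node"] = field(default_factory=list)


def plan_with_limit(nodes: List[Node], max_depth: int,
                    prefix: str = "", depth: int = 1) -> List[str]:
    """Plan output files, creating directories only down to max_depth levels."""
    paths: List[str] = []
    for node in nodes:
        if node.children and depth <= max_depth:
            # Still within the limit: directory plus index.md, then recurse.
            directory = f"{prefix}{node.name}/"
            paths.append(f"{directory}index.md")
            paths.extend(
                plan_with_limit(node.children, max_depth, directory, depth + 1)
            )
        else:
            # Leaf, or a parent past the limit: one file (assumed to hold
            # the descendants' content inline).
            paths.append(f"{prefix}{node.name}.md")
    return paths
```

With a chain of four nested sections `a > b > c > d` and `max_depth=2`, this plan stops creating directories at `a/b/` and emits `a/b/c.md` for everything below.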

### Working with Large Documents
For very large documents, use dry-run mode first to preview the structure:

```bash
markitect md-explode huge-manual.md --dry-run --verbose
```

This helps you understand the output structure and estimate disk space requirements.

## Troubleshooting

### Common Issues

**"No heading structure found"**
- The markdown file contains no headings (`#`, `##`, etc.)
- Solution: Add headings to structure your document

**"Permission denied"**
- Insufficient permissions to write to the output directory
- Solution: Check directory permissions or specify a different output location

**"File already exists"**
- The output directory already exists and contains files
- Solution: Choose a different output directory or remove existing files

**"Invalid markdown format"**
- The input file is not valid markdown
- Solution: Check the file format and fix any syntax errors

### Getting Help

```bash
# Show command help
markitect md-explode --help

# Show general MarkiTect help
markitect --help
```

## Best Practices

1. **Use descriptive headings** - They become directory and file names
2. **Maintain consistent heading levels** - Don't skip from `#` to `###`
3. **Keep headings concise** - Very long headings result in long filenames
4. **Avoid special characters** in headings when possible
5. **Preview first** - Use `--dry-run` for large documents
6. **Backup originals** - Always keep a copy of your source markdown file

## Integration

The `md-explode` command works well with other MarkiTect commands:

```bash
# Render exploded files to HTML
markitect md-render exploded_directory/ --recursive

# Create an index of the exploded structure
markitect md-index exploded_directory/ --recursive
```

Together, these commands form a complete documentation workflow, from a single file to an organized, rendered website.

## Technical Details

### Implementation
- **Language**: Python 3.8+
- **Dependencies**: Click for the CLI; the standard-library `unicodedata` module for filename normalization
- **Parser**: Custom markdown heading parser (no external markdown library required)
- **Performance**: Efficient for documents up to thousands of sections

### File System Compatibility
- **Cross-platform**: Works on Windows, macOS, and Linux
- **Character encoding**: UTF-8 throughout
- **Filename limits**: Generated names are truncated to 100 characters (see Filename Generation)
- **Path length**: Handles deep directory structures appropriately

## See Also

- [`md-render`](md-render-command.md) - Render markdown files to HTML
- [`md-index`](md-index-command.md) - Generate index pages for directories
- [`md-ingest`](md-ingest-command.md) - Import and process markdown files

---

*This documentation is for MarkiTect version 1.0+*