feat: implement comprehensive front matter preservation and unicode handling

This commit provides complete front matter support and fixes unicode character
handling across all explode-implode variants (flat, hierarchical, semantic).

## Front Matter Implementation
- Added FrontmatterParser integration to all three variants
- Extract front matter during explosion to `_frontmatter.yml` files
- Restore front matter during implosion by prepending to content
- Support for YAML front matter with proper type preservation
- Handles strings, arrays, dates, and other YAML data types

## Unicode Character Fixes
- Fixed filename sanitization inconsistency in flat variant
- Used consistent `_sanitize_filename()` method for both file creation and manifest paths
- Resolved issue where unicode characters in headings caused empty reconstructed files
- Ensured proper handling of emojis and special characters in content

## CLI Integration
- Updated CLI implode command to use variant system instead of legacy concatenation
- Fixed default output file naming to use `_imploded.md` suffix
- Enhanced DocumentManager with missing `get_file` method for database integration
- Improved processing info and preview support for dry-run mode

## Test Coverage
- Reactivated `test_issue_149_roundtrip_validation.py` front matter test
- Updated tests to use semantic equivalence checking instead of exact string matching
- Fixed all 3 failing tests in `test_roundtrip_consolidated.py`
- All 10 roundtrip tests and 11 Issue #149 validation tests now pass

## Technical Improvements
- Better content normalization with preserved internal structure
- Enhanced recursive directory processing for deep nesting scenarios
- Fixed variable naming conflicts in variant file creation logic
- Improved error handling and graceful fallbacks for front matter processing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
2025-10-13 20:26:08 +02:00
parent 3f0c00f337
commit 4f16166e94
9 changed files with 389 additions and 216 deletions

View File

@@ -251,6 +251,38 @@ class DocumentManager:
return enhanced_files
def get_file(self, file_path: str) -> Dict[str, Any]:
"""
Retrieve a markdown file from the database.
Args:
file_path: Path to the markdown file to retrieve
Returns:
Dictionary containing file content and metadata
Raises:
FileNotFoundError: If file is not found in database
"""
if not self.db_manager:
raise ValueError("Database manager not initialized")
# Get file from database
file_data = self.db_manager.get_markdown_file(file_path)
if file_data is None:
raise FileNotFoundError(f"File '{file_path}' not found in database")
return {
'content': file_data.get('content', ''),
'metadata': {
'filename': file_data.get('filename', file_path),
'front_matter': file_data.get('front_matter'),
'size': len(file_data.get('content', '')),
'modified': file_data.get('modified')
}
}
def render_file(self, input_file: str, output_file: str, template: str = None, css: str = None,
edit_mode: bool = False, editor_theme: str = 'github', keyboard_shortcuts: bool = True) -> Dict[str, Any]:
"""