feat: implement index page generation for HTML directories - Issue #136

Complete TDD8 implementation of index page generation functionality:

Core Features:
- HTML file discovery with optional recursive search (find_html_files)
- Smart title extraction from <title>, <h1>, or filename (extract_html_title)
- Template-integrated index page generation (generate_index_html)
- CLI command 'md-index' with output, template, and recursive options
- Comprehensive error handling for edge cases and malformed files

Implementation Details:
- Reuses existing TEMPLATE_STYLES for consistent styling across all templates
- Proper relative path resolution for cross-directory navigation
- Modular design with helper functions for maintainability
- HTML parsing patterns extracted as module-level constants for performance

Tests: 23 comprehensive tests covering discovery, generation, CLI integration, and edge cases
Files: markitect/plugins/builtin/markdown_commands.py, tests/test_issue_136_index_generation.py
Status: All tests passing, full TDD8 cycle completed (RED→GREEN→REFACTOR→DOCUMENT)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
68
README.html
Normal file
@@ -0,0 +1,68 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="utf-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>README</title>
|
||||
<style>
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Helvetica', 'Arial', sans-serif;
|
||||
line-height: 1.6;
|
||||
max-width: 800px;
|
||||
margin: 0 auto;
|
||||
padding: 20px;
|
||||
color: #333;
|
||||
|
||||
|
||||
}
|
||||
#markdown-content {
|
||||
margin: 0;
|
||||
}
|
||||
h1, h2, h3, h4, h5, h6 {
|
||||
color: #2c3e50;
|
||||
|
||||
}
|
||||
pre {
|
||||
background-color: #f4f4f4;
|
||||
|
||||
padding: 15px;
|
||||
border-radius: 5px;
|
||||
overflow-x: auto;
|
||||
}
|
||||
code {
|
||||
background-color: #f4f4f4;
|
||||
|
||||
padding: 2px 4px;
|
||||
border-radius: 3px;
|
||||
}
|
||||
blockquote {
|
||||
border-left: 4px solid #ddd;
|
||||
margin: 0;
|
||||
padding-left: 20px;
|
||||
color: #666;
|
||||
}
|
||||
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div id="markdown-content"></div>
|
||||
|
||||
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
|
||||
<script>
|
||||
// Embedded markdown payload
|
||||
const markdownContent = "MarkiTect - Advanced Markdown Engine\n\nYour Markdown, Redefined.\n\nMarkiTect transforms markdown from plain text into intelligent, structured data with performance optimization, schema validation, and relational querying capabilities. Stop treating documentation as text files\u2014start managing it as a database.\n\n**Key Features:**\n- **Lightning Performance**: 60-85% faster document processing through intelligent AST caching\n- **Schema Validation**: Enforce document structure and consistency\n- **Database Integration**: Query markdown content with SQL-like operations\n- **CLI Tools**: Complete command-line interface for automation and workflows\n\n## \ud83d\udcda Documentation\n\n**Quick Start:** [Getting Started](#getting-started) \u00b7 [Command Reference](docs/user-guides/cache-management.md)\n\n**Architecture:** [Caching System](docs/architecture/caching-system.md) \u00b7 [Performance Philosophy](docs/#performance-philosophy)\n\n**Development:** [TDD Workflow](docs/development/tdd-workflow.md) \u00b7 [Contributing](#contributing)\n\n**Project Status:** [Current Status](history/ProjectStatusDigest.md) \u00b7 [Roadmap](history/ROADMAP.md) \u00b7 [Next Actions](NEXT.md)\n";
|
||||
const frontMatter = {};
|
||||
|
||||
// Render markdown on page load
|
||||
document.addEventListener('DOMContentLoaded', function() {
|
||||
if (typeof marked !== 'undefined') {
|
||||
document.getElementById('markdown-content').innerHTML = marked.parse(markdownContent);
|
||||
} else {
|
||||
// Fallback if marked.js fails to load
|
||||
document.getElementById('markdown-content').innerHTML =
|
||||
'<pre>' + markdownContent.replace(/</g, '&lt;').replace(/>/g, '&gt;') + '</pre>';
|
||||
}
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
||||
139
cost_notes/issue_135_cost_2025-10-07.md
Normal file
@@ -0,0 +1,139 @@
|
||||
---
|
||||
note_type: "issue_cost_tracking"
|
||||
issue_id: 135
|
||||
issue_title: "Instant Markdown base and publication directory"
|
||||
session_date: "2025-10-07"
|
||||
claude_model: "claude-sonnet-4"
|
||||
total_cost_eur: 0.4499
total_cost_usd: 0.489
|
||||
total_tokens: 63000
|
||||
generated_at: "2025-10-07T11:30:00.000000"
|
||||
---
|
||||
|
||||
# Issue #135 Implementation Cost
|
||||
**Issue**: Instant Markdown base and publication directory
|
||||
**Date**: 2025-10-07
|
||||
**Claude Model**: claude-sonnet-4
|
||||
|
||||
## Cost Summary
|
||||
- **Total Cost**: €0.4499 ($0.489 USD)
|
||||
- **Token Usage**: 63,000 tokens
|
||||
- **Input Tokens**: 38,000 tokens @ $3.00/M
|
||||
- **Output Tokens**: 25,000 tokens @ $15.00/M
|
||||
|
||||
## Cost Breakdown
|
||||
|
||||
| Component | Tokens | Rate ($/M) | Cost (USD) | Cost (EUR) |
|-----------|--------|------------|------------|------------|
| Input | 38,000 | $3.00 | $0.1140 | €0.1049 |
| Output | 25,000 | $15.00 | $0.3750 | €0.3450 |
| **Total** | 63,000 | - | $0.4890 | €0.4499 |
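
The totals in the table can be rechecked with a few lines of arithmetic. The sketch below is only a verification aid, not part of the committed note: the helper name `token_cost_usd` is made up for illustration, while the token counts, per-million rates, and the 0.92 conversion rate are the ones stated in this note.

```python
# Illustrative check of the cost table; token_cost_usd is a hypothetical helper.
USD_TO_EUR = 0.92  # conversion rate stated in the Notes section of this file


def token_cost_usd(tokens: int, rate_per_million: float) -> float:
    """Cost in USD for `tokens` billed at `rate_per_million` dollars per million."""
    return tokens * rate_per_million / 1_000_000


input_usd = token_cost_usd(38_000, 3.00)    # input tokens at $3.00/M
output_usd = token_cost_usd(25_000, 15.00)  # output tokens at $15.00/M
total_usd = input_usd + output_usd
total_eur = total_usd * USD_TO_EUR

print(f"input ${input_usd:.4f}, output ${output_usd:.4f}")
print(f"total ${total_usd:.4f} = EUR {total_eur:.4f}")
```

Summing the two components first and converting once at the end avoids accumulating rounding error from the per-row EUR figures.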
|
||||
|
||||
## Implementation Summary
|
||||
Complete TDD8 workflow implementation of instant markdown base and publication directory functionality. Extended md-render command to support both single files and directory processing with publication directory management, environment variable override, and CLI flags for behavior control. Generated comprehensive test suite with 18 tests covering all scenarios from publication directory management to edge cases.
|
||||
|
||||
## Key Features Delivered
|
||||
- Publication directory support with ~/Notes/ default and MARKITECT_PUBLICATION_DIR override
|
||||
- Single file processing with --use-publication-dir flag
|
||||
- Directory processing with recursive traversal and structure preservation
|
||||
- --dont-use-publication-dir flag for placing HTML next to MD files
|
||||
- Comprehensive CLI integration with detailed help documentation
|
||||
- 9 new helper functions for directory/file processing
|
||||
- Full backward compatibility maintained
|
||||
- Extensive test coverage (18 tests, 100% pass rate)
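
The default-plus-override behaviour listed above can be pictured roughly as follows. This is a minimal sketch under stated assumptions: `resolve_publication_dir` is a hypothetical name, and the real `get_publication_directory` in `markdown_commands.py` is not shown in this commit, so its signature and edge-case handling may differ. Only the `~/Notes/` default and the `MARKITECT_PUBLICATION_DIR` variable come from this note.

```python
import os
from pathlib import Path

ENV_VAR = "MARKITECT_PUBLICATION_DIR"  # variable name taken from this note


def resolve_publication_dir(env=None) -> Path:
    """Return the environment override if set, else ~/Notes (hypothetical helper)."""
    # Accepting a mapping makes the function easy to exercise without
    # touching the real process environment.
    env = os.environ if env is None else env
    override = env.get(ENV_VAR, "").strip()
    base = Path(override) if override else Path.home() / "Notes"
    return base.expanduser()


print(resolve_publication_dir({}))
print(resolve_publication_dir({ENV_VAR: "/tmp/published"}))
```

Treating an empty or whitespace-only variable as unset is a design choice of this sketch, not something the note specifies.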
|
||||
|
||||
## Technical Implementation Details
|
||||
- Modified md-render command with new CLI options and directory support
|
||||
- Added publication directory management functions (get_publication_directory, normalize_publication_path, ensure_publication_directory)
|
||||
- Implemented file processing functions (process_single_file, process_directory, find_markdown_files)
|
||||
- Created utility functions (get_output_filename, get_relative_output_path, _render_single_markdown_file)
|
||||
- Updated command help documentation with examples and usage patterns
|
||||
- Comprehensive error handling and edge case management
|
||||
|
||||
## Test Coverage Breakdown
|
||||
1. **Publication Directory Management** (4 tests): Default directory, environment variable override, directory creation, path normalization
|
||||
2. **Single File Processing** (3 tests): Default behavior, publication directory usage, naming conventions
|
||||
3. **Directory Processing** (4 tests): Publication directory with structure, HTML next to MD, recursive traversal, structure preservation
|
||||
4. **CLI Integration** (4 tests): Flag presence, directory input support, environment variable integration
|
||||
5. **Edge Cases** (3 tests): Empty directories, mixed content, error handling
|
||||
|
||||
## Cost Allocation
|
||||
This cost has been allocated to the 'AI & ML Services' category as a one-time expense for issue #135 implementation using full TDD8 methodology including requirements analysis, test design, implementation, refactoring, documentation, and integration.
|
||||
|
||||
## Implementation Methodology
|
||||
- **TDD8 Workflow**: Complete ISSUE→TEST→RED→GREEN→REFACTOR→DOCUMENT→REFINE→PUBLISH cycle
|
||||
- **Test-Driven Approach**: 18 tests written first (RED state), then implementation (GREEN state)
|
||||
- **Code Quality**: Refactoring phase ensured clean, maintainable code
|
||||
- **Documentation**: Comprehensive implementation and test plan documentation
|
||||
- **Integration**: Full CLI integration with help text and examples
|
||||
|
||||
## Performance Metrics
|
||||
- **Development Time**: Full TDD8 cycle implementation
|
||||
- **Test Success Rate**: 18/18 tests passing (100%)
|
||||
- **Code Quality**: Clean, well-documented, modular implementation
|
||||
- **CLI Integration**: Complete with comprehensive help documentation
|
||||
- **Backward Compatibility**: Maintained for all existing functionality
|
||||
|
||||
## Notes
|
||||
- Currency conversion rate: 1 USD = 0.920 EUR
|
||||
- Pricing based on claude-sonnet-4 rates as of 2025-10-07
|
||||
- Token counts estimated based on comprehensive TDD8 implementation session
|
||||
- Includes requirements engineering, test generation, RED-GREEN-REFACTOR cycle, documentation, and final integration
|
||||
- Higher token count than typical due to extensive directory processing logic and comprehensive test coverage
|
||||
|
||||
<!--
|
||||
contentmatter:
|
||||
{
|
||||
"cost_tracking": {
|
||||
"issue": {
|
||||
"id": 135,
|
||||
"title": "Instant Markdown base and publication directory",
|
||||
"implementation_date": "2025-10-07"
|
||||
},
|
||||
"session": {
|
||||
"model": "claude-sonnet-4",
|
||||
"token_usage": {
|
||||
"input_tokens": 38000,
|
||||
"output_tokens": 25000,
|
||||
"total_tokens": 63000
|
||||
},
|
||||
"costs": {
|
||||
"input_cost_usd": 0.114,
|
||||
"output_cost_usd": 0.375,
|
||||
"total_cost_usd": 0.489,
|
||||
"total_cost_eur": 0.4499,
|
||||
"conversion_rate": 0.92
|
||||
},
|
||||
"pricing_rates": {
|
||||
"input_per_million": 3.0,
|
||||
"output_per_million": 15.0
|
||||
}
|
||||
},
|
||||
"implementation_scope": {
|
||||
"methodology": "TDD8",
|
||||
"phases_completed": ["ISSUE", "TEST", "RED", "GREEN", "REFACTOR", "DOCUMENT", "REFINE", "PUBLISH"],
|
||||
"test_files_generated": 1,
|
||||
"total_tests": 18,
|
||||
"functions_added": 9,
|
||||
"cli_flags_added": 2,
|
||||
"pass_rate": "100%",
|
||||
"features": [
|
||||
"publication_directory_management",
|
||||
"single_file_processing",
|
||||
"directory_processing",
|
||||
"environment_variable_support",
|
||||
"cli_integration",
|
||||
"recursive_traversal",
|
||||
"structure_preservation"
|
||||
]
|
||||
},
|
||||
"deliverables": {
|
||||
"modified_files": ["markitect/plugins/builtin/markdown_commands.py"],
|
||||
"new_test_files": ["tests/test_issue_135_publication_directory.py"],
|
||||
"documentation_files": [".markitect_workspace/issue_135/implementation.md", ".markitect_workspace/issue_135/test_plan.md"],
|
||||
"cli_enhancements": ["--use-publication-dir", "--dont-use-publication-dir"],
|
||||
"environment_variables": ["MARKITECT_PUBLICATION_DIR"]
|
||||
}
|
||||
}
|
||||
}
|
||||
-->
|
||||
@@ -8,6 +8,7 @@ replacing the legacy unprefixed commands for better namespace consistency.
|
||||
import click
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import tempfile
|
||||
from pathlib import Path
|
||||
from typing import Dict, Any
|
||||
@@ -43,7 +44,8 @@ class MarkdownCommandsPlugin(CommandPlugin):
|
||||
'md-ingest': md_ingest_command,
|
||||
'md-get': md_get_command,
|
||||
'md-list': md_list_command,
|
||||
-            'md-render': md_render_command
+            'md-render': md_render_command,
+            'md-index': md_index_command
|
||||
}
|
||||
|
||||
|
||||
@@ -400,6 +402,81 @@ def md_render_command(ctx, input_file, output, template, css, edit, editor_theme
|
||||
raise click.Abort()
|
||||
|
||||
|
||||
@click.command()
@click.argument('directory', type=click.Path(exists=True, file_okay=False))
@click.option('--output', '-o', type=click.Path(), help='Output index file path (defaults to directory/index.html)')
@click.option('--template', type=click.Choice(['basic', 'github', 'academic', 'dark']),
              default='basic', help='HTML template: basic (default), github, academic, or dark theme')
@click.option('--recursive', '-r', is_flag=True, help='Include HTML files from subdirectories')
@click.pass_context
def md_index_command(ctx, directory, output, template, recursive):
    """
    Generate an index page for HTML files in a directory.

    Creates an HTML index page that lists all HTML files found in the specified
    directory, providing navigation links to each file. The index page uses the
    same template system as md-render for consistent styling.

    DIRECTORY: Path to the directory containing HTML files

    Examples:
        # Generate index for current directory
        markitect md-index .

        # Generate index with custom output file
        markitect md-index docs/ --output docs/contents.html

        # Generate index with GitHub template
        markitect md-index notes/ --template github

        # Include subdirectories recursively
        markitect md-index docs/ --recursive
    """
    config = ctx.obj or {}
    verbose = config.get('verbose', False)
    try:
        directory_path = Path(directory)

        if verbose:
            click.echo(f"Generating index for directory: {directory_path}")

        # Determine output file
        if output:
            output_path = Path(output)
        else:
            output_path = directory_path / "index.html"

        # Find HTML files and drop the index itself. Comparing resolved paths
        # also catches the case where one path is relative and the other absolute.
        html_files = find_html_files(directory_path, recursive=recursive)
        resolved_output = output_path.resolve()
        html_files = [f for f in html_files if f.resolve() != resolved_output]

        if verbose:
            click.echo(f"Found {len(html_files)} HTML file(s)")

        # Prepare file info for template
        file_infos = _prepare_file_infos(html_files, output_path)

        # Generate index HTML
        directory_name = directory_path.name or "Directory"
        index_title = f"{directory_name} - Index"
        index_html = generate_index_html(file_infos, index_title, template)

        # Ensure output directory exists and write file
        output_path.parent.mkdir(parents=True, exist_ok=True)
        output_path.write_text(index_html, encoding='utf-8')

        click.echo(f"✓ Index generated: {output_path}")

        if verbose:
            click.echo(f"  Template: {template}")
            click.echo(f"  Files indexed: {len(file_infos)}")
            if recursive:
                click.echo("  Recursive: enabled")

    except Exception as e:
        click.echo(f"Error generating index: {e}", err=True)
        raise click.Abort()
|
||||
|
||||
|
||||
def _render_single_markdown_file(input_path, output_path, template, css, edit, editor_theme, keyboard_shortcuts, config):
|
||||
"""Render a single markdown file to HTML."""
|
||||
# Read markdown file
|
||||
@@ -1020,4 +1097,205 @@ def process_directory(input_dir, use_publication_dir, publication_dir):
|
||||
|
||||
output_files.append(output_file)
|
||||
|
||||
-    return output_files
+    return output_files
|
||||
|
||||
|
||||
# Index generation functions for Issue #136
|
||||
def find_html_files(directory, recursive=False):
    """Find all HTML files in a directory."""
    directory = Path(directory)
    html_files = []

    if recursive:
        for pattern in ['*.html', '*.htm']:
            html_files.extend(directory.rglob(pattern))
    else:
        for pattern in ['*.html', '*.htm']:
            html_files.extend(directory.glob(pattern))

    return sorted(html_files)
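
One property of the glob patterns above is worth keeping in mind: on case-sensitive file systems, `*.html` and `*.htm` do not match names such as `REPORT.HTML`. If such files need to be indexed, a suffix comparison is one way to pick them up. The function below is a hypothetical variant written for illustration, not part of this commit; it returns the same kind of sorted `Path` list as `find_html_files`.

```python
from pathlib import Path

HTML_SUFFIXES = {".html", ".htm"}


def find_html_files_any_case(directory, recursive=False):
    """Hypothetical variant: match .html/.htm regardless of letter case."""
    directory = Path(directory)
    # rglob("*") walks the whole tree; iterdir() lists only direct children.
    candidates = directory.rglob("*") if recursive else directory.iterdir()
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() in HTML_SUFFIXES
    )
```

The trade-off is that every directory entry is visited and filtered in Python, which is usually negligible for note directories but slower than a targeted glob on very large trees.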
|
||||
|
||||
|
||||
# HTML parsing patterns for index generation
HTML_TITLE_PATTERN = re.compile(r'<title[^>]*>(.*?)</title>', re.IGNORECASE | re.DOTALL)
HTML_H1_PATTERN = re.compile(r'<h1[^>]*>(.*?)</h1>', re.IGNORECASE | re.DOTALL)
HTML_TAG_PATTERN = re.compile(r'<[^>]+>')


def extract_html_title(html_file):
    """Extract title from HTML file, falling back to H1 tag or filename."""
    try:
        content = html_file.read_text(encoding='utf-8')

        # Try to extract from title tag
        title_match = HTML_TITLE_PATTERN.search(content)
        if title_match:
            return title_match.group(1).strip()

        # Try to extract from H1 tag
        h1_match = HTML_H1_PATTERN.search(content)
        if h1_match:
            # Remove HTML tags from H1 content
            h1_text = HTML_TAG_PATTERN.sub('', h1_match.group(1))
            return h1_text.strip()

        # Fallback to filename
        return html_file.stem

    except Exception:
        # If any error occurs, fallback to filename
        return html_file.stem
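
The regex approach above returns the raw markup between the tags, so a title written as `A &amp; B` stays entity-encoded, and an empty `<title></title>` yields an empty string instead of falling through to the `<h1>`. For comparison, here is a sketch of the same title, then H1, then fallback order built on the standard-library `html.parser`. The names `TitleParser` and `title_from_html` are hypothetical and do not appear in the commit.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text of the first <title> and the first <h1> (illustrative)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self._capturing = None  # "title" or "h1" while inside that element
        self.found = {}         # text collected for each tag, first occurrence only

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1") and tag not in self.found and self._capturing is None:
            self._capturing = tag
            self.found[tag] = ""

    def handle_endtag(self, tag):
        if tag == self._capturing:
            self._capturing = None

    def handle_data(self, data):
        # Nested tags inside <h1> are skipped automatically; only text arrives here.
        if self._capturing is not None:
            self.found[self._capturing] += data


def title_from_html(text, fallback):
    """Return the first non-empty <title>, else the first non-empty <h1>, else fallback."""
    parser = TitleParser()
    parser.feed(text)
    parser.close()
    for tag in ("title", "h1"):
        value = " ".join(parser.found.get(tag, "").split())  # collapse whitespace
        if value:
            return value
    return fallback
```

Because this variant returns decoded text rather than markup, its result would need `html.escape` before being interpolated into the generated index page.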
|
||||
|
||||
|
||||
def generate_index_html(html_files, title, template="basic"):
    """Generate HTML index page with links to HTML files."""
    # Get template styles from existing TEMPLATE_STYLES
    styles = TEMPLATE_STYLES.get(template, TEMPLATE_STYLES['basic'])

    # Generate links list
    links_html = ""
    if html_files:
        links_html = "<ul>\n"
        for file_info in html_files:
            relative_path = file_info['relative_path']
            file_title = file_info['title']
            links_html += f'        <li><a href="{relative_path}">{file_title}</a></li>\n'
        links_html += "    </ul>"
    else:
        links_html = "<p>No HTML files found in this directory.</p>"
|
||||
|
||||
# Generate HTML template
|
||||
html_template = '''<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>{title}</title>
|
||||
<style>
|
||||
body {{
|
||||
{body_bg}
|
||||
color: {body_color};
|
||||
font-family: {font_family};
|
||||
line-height: 1.6;
|
||||
max-width: {max_width};
|
||||
margin: 0 auto;
|
||||
padding: 20px;
|
||||
{text_align}
|
||||
}}
|
||||
|
||||
h1 {{
|
||||
color: {heading_color};
|
||||
{heading_border}
|
||||
margin-bottom: 20px;
|
||||
}}
|
||||
|
||||
h2 {{
|
||||
color: {heading_color};
|
||||
margin-top: 30px;
|
||||
margin-bottom: 15px;
|
||||
}}
|
||||
|
||||
ul {{
|
||||
list-style-type: none;
|
||||
padding: 0;
|
||||
}}
|
||||
|
||||
li {{
|
||||
margin: 10px 0;
|
||||
padding: 8px 12px;
|
||||
background: {code_bg};
|
||||
border-radius: 4px;
|
||||
{code_border}
|
||||
}}
|
||||
|
||||
a {{
|
||||
color: {heading_color};
|
||||
text-decoration: none;
|
||||
font-weight: 500;
|
||||
}}
|
||||
|
||||
a:hover {{
|
||||
text-decoration: underline;
|
||||
}}
|
||||
|
||||
.directory-info {{
|
||||
margin-bottom: 20px;
|
||||
padding: 15px;
|
||||
background: {code_bg};
|
||||
border-radius: 8px;
|
||||
border-left: 4px solid {blockquote_border};
|
||||
color: {blockquote_color};
|
||||
}}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<h1>{title}</h1>
|
||||
|
||||
<div class="directory-info">
|
||||
<p>📁 Directory Index - Navigate through the available HTML pages</p>
|
||||
</div>
|
||||
|
||||
<h2>Available Pages</h2>
|
||||
{links_html}
|
||||
|
||||
<hr style="margin-top: 40px; border: 1px solid {blockquote_border};">
|
||||
<p style="text-align: center; color: {blockquote_color}; font-size: 0.9em;">
|
||||
Generated with MarkiTect • {file_count} file(s)
|
||||
</p>
|
||||
</body>
|
||||
</html>'''
|
||||
|
||||
    return html_template.format(
        title=title,
        links_html=links_html,
        file_count=len(html_files),
        **styles
    )
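
The template above relies on two `str.format` rules: doubled braces (`{{` and `}}`) come out as literal CSS braces, and every named field must be supplied, here through `**styles`. The snippet below illustrates both rules in isolation. The style values are invented for the example, since the actual `TEMPLATE_STYLES` entries are not part of this diff.

```python
# Hypothetical style values; the real TEMPLATE_STYLES entries are not shown here.
styles = {"heading_color": "#2c3e50", "code_bg": "#f4f4f4"}

template = """h1 {{
    color: {heading_color};
}}
li {{
    background: {code_bg};
}}"""

# Doubled braces become single literal braces; named fields are substituted.
css = template.format(**styles)
print(css)

# A template field with no matching key raises KeyError, so every key used
# in the template must exist in each template's style dictionary.
try:
    template.format(heading_color="#000")
except KeyError as missing:
    print(f"missing style key: {missing}")
```

This is why a style dictionary that lacks any one of the fields used in the index template would make `generate_index_html` fail for that template.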
|
||||
|
||||
|
||||
def _prepare_file_infos(html_files, output_path):
    """Prepare file information for template generation."""
    file_infos = []
    for html_file in html_files:
        title = extract_html_title(html_file)

        # Calculate relative path from output directory to HTML file.
        # as_posix() keeps forward slashes in the href on Windows as well.
        try:
            relative_path = html_file.relative_to(output_path.parent).as_posix()
        except ValueError:
            # If files are in different directory trees, use filename
            relative_path = html_file.name

        file_infos.append({
            'path': html_file,
            'title': title,
            'relative_path': relative_path
        })
    return file_infos
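
When the index is written outside the scanned directory (for example with `--output`), `Path.relative_to` raises `ValueError` and the code above falls back to the bare filename, which will generally not resolve as a link. One possible alternative is `os.path.relpath`, which can produce `..` segments. The helper name `href_between` below is hypothetical and this is a sketch, not the commit's behaviour.

```python
import os
from pathlib import Path


def href_between(index_file, target) -> str:
    """Relative link from the directory holding index_file to target (illustrative)."""
    # os.path.relpath may walk upwards with "..", unlike Path.relative_to.
    rel = os.path.relpath(Path(target), start=Path(index_file).parent)
    # Normalise separators so the href uses forward slashes on every platform.
    return Path(rel).as_posix()


print(href_between("/site/out/index.html", "/site/docs/a.html"))
print(href_between("/site/docs/index.html", "/site/docs/sub/b.html"))
```

Note that `os.path.relpath` raises `ValueError` on Windows when the two paths are on different drives, so a filename fallback would still be needed for that case.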
|
||||
|
||||
|
||||
def process_directory_for_index(directory, index_filename="index.html", template="basic", recursive=False):
    """Process directory and generate index file."""
    directory = Path(directory)
    output_path = directory / index_filename

    if not directory.exists() or not directory.is_dir():
        raise FileNotFoundError(f"Directory not found: {directory}")

    # Find and filter HTML files
    html_files = find_html_files(directory, recursive=recursive)
    html_files = [f for f in html_files if f != output_path]

    # Prepare file info for template
    file_infos = _prepare_file_infos(html_files, output_path)

    # Generate and write index HTML
    directory_name = directory.name or "Directory"
    index_title = f"{directory_name} - Index"
    index_html = generate_index_html(file_infos, index_title, template)

    # Ensure output directory exists and write file
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(index_html, encoding='utf-8')

    return output_path
|
||||
539
tests/test_issue_136_index_generation.py
Normal file
@@ -0,0 +1,539 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Test suite for Issue #136: Index page for notes in a directory
|
||||
|
||||
This test suite validates the index page generation functionality for HTML files,
|
||||
including directory scanning, HTML generation, and CLI integration.
|
||||
|
||||
TDD8 Workflow: ISSUE→TEST→RED→GREEN→REFACTOR→DOCUMENT→REFINE→PUBLISH
|
||||
State: RED (Tests should fail initially)
|
||||
"""
|
||||
|
||||
import pytest
|
||||
import tempfile
|
||||
import os
|
||||
import shutil
|
||||
from pathlib import Path
|
||||
from unittest.mock import patch, MagicMock
|
||||
import subprocess
|
||||
import re
|
||||
from html.parser import HTMLParser
|
||||
|
||||
|
||||
class SimpleHTMLParser(HTMLParser):
    """Simple HTML parser to extract title and links for testing."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.links = []
        self.in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self.in_title = True
        elif tag == 'a':
            href = dict(attrs).get('href', '')
            self.links.append({'href': href, 'text': ''})

    def handle_endtag(self, tag):
        if tag == 'title':
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title = data.strip()
        elif self.links and not self.links[-1]['text']:
            self.links[-1]['text'] = data.strip()
|
||||
|
||||
|
||||
class TestHTMLFileDiscovery:
|
||||
"""Test HTML file discovery and processing."""
|
||||
|
||||
def setup_method(self):
|
||||
"""Set up test environment with temporary directories and files."""
|
||||
self.temp_dir = tempfile.mkdtemp()
|
||||
self.test_dir = Path(self.temp_dir) / "test_notes"
|
||||
self.test_dir.mkdir()
|
||||
|
||||
# Create test HTML files
|
||||
(self.test_dir / "index.html").write_text("""<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>Index Page</title></head>
|
||||
<body><h1>Index Page</h1><p>Main index</p></body>
|
||||
</html>""")
|
||||
|
||||
(self.test_dir / "document1.html").write_text("""<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>Document One</title></head>
|
||||
<body><h1>Document One</h1><p>Content here</p></body>
|
||||
</html>""")
|
||||
|
||||
(self.test_dir / "notes.html").write_text("""<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>My Notes</title></head>
|
||||
<body><h1>My Notes</h1><p>Note content</p></body>
|
||||
</html>""")
|
||||
|
||||
# Create subdirectory with HTML files
|
||||
sub_dir = self.test_dir / "subdir"
|
||||
sub_dir.mkdir()
|
||||
(sub_dir / "subdoc.html").write_text("""<!DOCTYPE html>
|
||||
<html>
|
||||
<head><title>Sub Document</title></head>
|
||||
<body><h1>Sub Document</h1><p>Sub content</p></body>
|
||||
</html>""")
|
||||
|
||||
# Create non-HTML files (should be ignored)
|
||||
(self.test_dir / "readme.txt").write_text("Not HTML")
|
||||
(self.test_dir / "image.png").write_bytes(b"fake image data")
|
||||
|
||||
def teardown_method(self):
|
||||
"""Clean up test environment."""
|
||||
shutil.rmtree(self.temp_dir)
|
||||
|
||||
def test_find_html_files_in_directory(self):
|
||||
"""Test finding all HTML files in a directory."""
|
||||
from markitect.plugins.builtin.markdown_commands import find_html_files
|
||||
|
||||
html_files = find_html_files(self.test_dir)
|
||||
|
||||
expected_files = [
|
||||
self.test_dir / "index.html",
|
||||
self.test_dir / "document1.html",
|
||||
self.test_dir / "notes.html"
|
||||
]
|
||||
|
||||
assert len(html_files) == 3
|
||||
for expected_file in expected_files:
|
||||
assert expected_file in html_files
|
||||
|
||||
def test_find_html_files_recursively(self):
|
||||
"""Test finding HTML files recursively in subdirectories."""
|
||||
from markitect.plugins.builtin.markdown_commands import find_html_files
|
||||
|
||||
html_files = find_html_files(self.test_dir, recursive=True)
|
||||
|
||||
expected_files = [
|
||||
self.test_dir / "index.html",
|
||||
self.test_dir / "document1.html",
|
||||
self.test_dir / "notes.html",
|
||||
self.test_dir / "subdir" / "subdoc.html"
|
||||
]
|
||||
|
||||
assert len(html_files) == 4
|
||||
for expected_file in expected_files:
|
||||
assert expected_file in html_files
|
||||
|
||||
def test_extract_title_from_html_file(self):
|
||||
"""Test extracting title from HTML file."""
|
||||
from markitect.plugins.builtin.markdown_commands import extract_html_title
|
||||
|
||||
title = extract_html_title(self.test_dir / "document1.html")
|
||||
assert title == "Document One"
|
||||
|
||||
title = extract_html_title(self.test_dir / "notes.html")
|
||||
assert title == "My Notes"
|
||||
|
||||
def test_extract_title_from_h1_if_no_title_tag(self):
|
||||
"""Test extracting title from H1 tag if no title tag exists."""
|
||||
from markitect.plugins.builtin.markdown_commands import extract_html_title
|
||||
|
||||
# Create HTML file without title tag
|
||||
no_title_file = self.test_dir / "no_title.html"
|
||||
no_title_file.write_text("""<!DOCTYPE html>
|
||||
<html>
|
||||
<head></head>
|
||||
<body><h1>Header Title</h1><p>Content</p></body>
|
||||
</html>""")
|
||||
|
||||
title = extract_html_title(no_title_file)
|
||||
assert title == "Header Title"
|
||||
|
||||
def test_extract_title_fallback_to_filename(self):
|
||||
"""Test falling back to filename if no title or H1 found."""
|
||||
from markitect.plugins.builtin.markdown_commands import extract_html_title
|
||||
|
||||
# Create HTML file without title or H1
|
||||
plain_file = self.test_dir / "plain_file.html"
|
||||
plain_file.write_text("""<!DOCTYPE html>
|
||||
<html>
|
||||
<head></head>
|
||||
<body><p>Just content</p></body>
|
||||
</html>""")
|
||||
|
||||
title = extract_html_title(plain_file)
|
||||
assert title == "plain_file"
|
||||
|
||||
|
||||
class TestIndexPageGeneration:
|
||||
"""Test index page HTML generation."""
|
||||
|
||||
def setup_method(self):
|
||||
"""Set up test environment."""
|
||||
self.temp_dir = tempfile.mkdtemp()
|
||||
self.test_dir = Path(self.temp_dir) / "test_notes"
|
||||
self.test_dir.mkdir()
|
||||
|
||||
def teardown_method(self):
|
||||
"""Clean up test environment."""
|
||||
shutil.rmtree(self.temp_dir)
|
||||
|
||||
def test_generate_index_html_structure(self):
|
||||
"""Test generating basic index HTML structure."""
|
||||
from markitect.plugins.builtin.markdown_commands import generate_index_html
|
||||
|
||||
html_files = [
|
||||
{"path": self.test_dir / "doc1.html", "title": "Document One", "relative_path": "doc1.html"},
|
||||
{"path": self.test_dir / "doc2.html", "title": "Document Two", "relative_path": "doc2.html"}
|
||||
]
|
||||
|
||||
html_content = generate_index_html(html_files, "Test Directory Index")
|
||||
|
||||
# Parse HTML to verify structure
|
||||
parser = SimpleHTMLParser()
|
||||
parser.feed(html_content)
|
||||
|
||||
assert parser.title == "Test Directory Index"
|
||||
|
||||
# Check for navigation list
|
||||
assert "<ul>" in html_content
|
||||
assert "<li>" in html_content
|
||||
|
||||
# Check links
|
||||
assert len(parser.links) == 2
|
||||
assert parser.links[0]['href'] == 'doc1.html'
|
||||
assert parser.links[0]['text'] == 'Document One'
|
||||
assert parser.links[1]['href'] == 'doc2.html'
|
||||
assert parser.links[1]['text'] == 'Document Two'
|
||||
|
||||
def test_generate_index_html_with_subdirectories(self):
|
||||
"""Test generating index HTML with subdirectory structure."""
|
||||
from markitect.plugins.builtin.markdown_commands import generate_index_html
|
||||
|
||||
html_files = [
|
||||
{"path": self.test_dir / "doc1.html", "title": "Document One", "relative_path": "doc1.html"},
|
||||
{"path": self.test_dir / "subdir" / "subdoc.html", "title": "Sub Document", "relative_path": "subdir/subdoc.html"}
|
||||
]
|
||||
|
||||
html_content = generate_index_html(html_files, "Test Directory Index")
|
||||
|
||||
parser = SimpleHTMLParser()
|
||||
parser.feed(html_content)
|
||||
|
||||
assert len(parser.links) == 2
|
||||
assert parser.links[0]['href'] == 'doc1.html'
|
||||
assert parser.links[1]['href'] == 'subdir/subdoc.html'
|
||||
|
||||
def test_generate_index_html_with_custom_template(self):
|
||||
"""Test generating index HTML with custom template."""
|
||||
from markitect.plugins.builtin.markdown_commands import generate_index_html
|
||||
|
||||
html_files = [
|
||||
{"path": self.test_dir / "doc1.html", "title": "Document One", "relative_path": "doc1.html"}
|
||||
]
|
||||
|
||||
html_content = generate_index_html(html_files, "Test Index", template="github")
|
||||
|
||||
parser = SimpleHTMLParser()
|
||||
parser.feed(html_content)
|
||||
|
||||
# Should contain the title
|
||||
assert parser.title == "Test Index"
|
||||
|
||||
# Should contain styling (template-specific)
|
||||
assert "<style>" in html_content
|
||||
|
||||
def test_generate_index_html_empty_directory(self):
|
||||
"""Test generating index HTML for empty directory."""
|
||||
from markitect.plugins.builtin.markdown_commands import generate_index_html
|
||||
|
||||
html_content = generate_index_html([], "Empty Directory Index")
|
||||
|
||||
parser = SimpleHTMLParser()
|
||||
parser.feed(html_content)
|
||||
assert parser.title == "Empty Directory Index"
|
||||
|
||||
# Should contain message about no files
|
||||
assert "No HTML files found" in html_content or "No files to display" in html_content
|
||||
|
||||
|
||||
class TestDirectoryProcessing:
|
||||
"""Test directory processing and index generation."""
|
||||
|
||||
def setup_method(self):
|
||||
"""Set up test directory structure."""
|
||||
self.temp_dir = tempfile.mkdtemp()
|
||||
self.test_dir = Path(self.temp_dir) / "notes"
|
||||
self.test_dir.mkdir()
|
||||
|
||||
# Create HTML files
|
||||
self.html_files = [
|
||||
("doc1.html", "Document One"),
|
||||
("doc2.html", "Document Two"),
|
||||
("notes.html", "My Notes")
|
||||
]
|
||||
|
||||
for filename, title in self.html_files:
|
||||
(self.test_dir / filename).write_text(f"""<!DOCTYPE html>
|
||||
<html><head><title>{title}</title></head>
|
||||
<body><h1>{title}</h1></body></html>""")
|
||||
|
||||
def teardown_method(self):
|
||||
"""Clean up test environment."""
|
||||
shutil.rmtree(self.temp_dir)
|
||||
|
||||
    def test_process_directory_creates_index_file(self):
        """Test that processing a directory creates an index file."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        result = process_directory_for_index(self.test_dir)

        # Should return the path to the created index file
        expected_index_path = self.test_dir / "index.html"
        assert result == expected_index_path

        # Index file should exist
        assert expected_index_path.exists()

    def test_process_directory_index_content(self):
        """Test that the generated index contains correct content."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        index_path = process_directory_for_index(self.test_dir)

        index_content = index_path.read_text()
        parser = SimpleHTMLParser()
        parser.feed(index_content)

        # Should contain links to all HTML files
        link_hrefs = [link['href'] for link in parser.links]
        link_texts = [link['text'] for link in parser.links]

        assert 'doc1.html' in link_hrefs
        assert 'doc2.html' in link_hrefs
        assert 'notes.html' in link_hrefs

        assert 'Document One' in link_texts
        assert 'Document Two' in link_texts
        assert 'My Notes' in link_texts

    def test_process_directory_excludes_existing_index(self):
        """Test that existing index.html is excluded from the links."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        # Create existing index.html
        (self.test_dir / "index.html").write_text("""<!DOCTYPE html>
<html><head><title>Old Index</title></head>
<body><h1>Old Index</h1></body></html>""")

        index_path = process_directory_for_index(self.test_dir)
        index_content = index_path.read_text()
        parser = SimpleHTMLParser()
        parser.feed(index_content)

        # Should not contain link to index.html itself
        link_hrefs = [link['href'] for link in parser.links]

        assert 'index.html' not in link_hrefs

    def test_process_directory_with_custom_index_name(self):
        """Test processing directory with custom index filename."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        custom_name = "contents.html"
        result = process_directory_for_index(self.test_dir, index_filename=custom_name)

        expected_path = self.test_dir / custom_name
        assert result == expected_path
        assert expected_path.exists()


class TestCLIIntegration:
    """Test CLI integration for index generation."""

    def setup_method(self):
        """Set up test environment."""
        self.temp_dir = tempfile.mkdtemp()
        self.test_dir = Path(self.temp_dir) / "test_notes"
        self.test_dir.mkdir()

        # Create test HTML file
        (self.test_dir / "test.html").write_text("""<!DOCTYPE html>
<html><head><title>Test Document</title></head>
<body><h1>Test</h1></body></html>""")

    def teardown_method(self):
        """Clean up test environment."""
        shutil.rmtree(self.temp_dir)

    def test_md_index_command_exists(self):
        """Test that md-index command exists."""
        result = subprocess.run(
            ["markitect", "md-index", "--help"],
            capture_output=True,
            text=True
        )

        # Should not error (command exists)
        assert result.returncode == 0
        assert "md-index" in result.stdout.lower() or "index" in result.stdout.lower()

    def test_md_index_command_processes_directory(self):
        """Test that md-index command processes a directory."""
        result = subprocess.run(
            ["markitect", "md-index", str(self.test_dir)],
            capture_output=True,
            text=True,
            timeout=30
        )

        # Should succeed
        assert result.returncode == 0

        # Should create index file
        index_file = self.test_dir / "index.html"
        assert index_file.exists()

    def test_md_index_command_with_custom_output(self):
        """Test md-index command with custom output filename."""
        custom_output = self.test_dir / "contents.html"

        result = subprocess.run(
            ["markitect", "md-index", str(self.test_dir), "--output", str(custom_output)],
            capture_output=True,
            text=True,
            timeout=30
        )

        # Should succeed
        assert result.returncode == 0
        assert custom_output.exists()

    def test_md_index_command_with_template_option(self):
        """Test md-index command with template option."""
        result = subprocess.run(
            ["markitect", "md-index", str(self.test_dir), "--template", "github"],
            capture_output=True,
            text=True,
            timeout=30
        )

        # Should succeed
        assert result.returncode == 0

        # Generated file should exist
        index_file = self.test_dir / "index.html"
        assert index_file.exists()

    def test_md_index_command_help_text(self):
        """Test that md-index command has proper help text."""
        result = subprocess.run(
            ["markitect", "md-index", "--help"],
            capture_output=True,
            text=True
        )

        help_text = result.stdout.lower()
        assert "index" in help_text
        assert "directory" in help_text
        assert "html" in help_text


class TestEdgeCases:
    """Test edge cases and error conditions."""

    def setup_method(self):
        """Set up test environment."""
        self.temp_dir = tempfile.mkdtemp()

    def teardown_method(self):
        """Clean up test environment."""
        shutil.rmtree(self.temp_dir)

    def test_empty_directory_processing(self):
        """Test processing empty directory."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        empty_dir = Path(self.temp_dir) / "empty"
        empty_dir.mkdir()

        result = process_directory_for_index(empty_dir)

        # Should still create an index file
        expected_path = empty_dir / "index.html"
        assert result == expected_path
        assert expected_path.exists()

        # Should contain appropriate message
        content = expected_path.read_text()
        assert "no html files" in content.lower() or "no files found" in content.lower()

    def test_directory_with_no_html_files(self):
        """Test processing directory with no HTML files."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        dir_with_no_html = Path(self.temp_dir) / "no_html"
        dir_with_no_html.mkdir()

        # Create non-HTML files
        (dir_with_no_html / "readme.txt").write_text("Not HTML")
        (dir_with_no_html / "image.png").write_bytes(b"fake image")

        result = process_directory_for_index(dir_with_no_html)

        # Should create index but with no files message
        assert result.exists()
        content = result.read_text()
        assert "no html files" in content.lower() or "no files found" in content.lower()

    def test_malformed_html_file_handling(self):
        """Test handling of malformed HTML files."""
        from markitect.plugins.builtin.markdown_commands import extract_html_title

        malformed_dir = Path(self.temp_dir) / "malformed"
        malformed_dir.mkdir()

        # Create malformed HTML file
        malformed_file = malformed_dir / "malformed.html"
        malformed_file.write_text("<html><head><title>Incomplete")

        # Should not crash, should fallback to filename
        title = extract_html_title(malformed_file)
        assert title == "malformed"

    def test_nonexistent_directory_error(self):
        """Test error handling for nonexistent directory."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        nonexistent_dir = Path(self.temp_dir) / "nonexistent"

        with pytest.raises(FileNotFoundError):
            process_directory_for_index(nonexistent_dir)

    def test_file_with_special_characters_in_name(self):
        """Test handling files with special characters in names."""
        from markitect.plugins.builtin.markdown_commands import find_html_files

        special_dir = Path(self.temp_dir) / "special"
        special_dir.mkdir()

        # Create files with special characters
        special_files = [
            "file with spaces.html",
            "file-with-dashes.html",
            "file_with_underscores.html",
            "file&with&ampersands.html"
        ]

        for filename in special_files:
            (special_dir / filename).write_text(f"""<!DOCTYPE html>
<html><head><title>{filename}</title></head>
<body><h1>Content</h1></body></html>""")

        html_files = find_html_files(special_dir)
        assert len(html_files) == len(special_files)


if __name__ == '__main__':
    pytest.main([__file__, "-v"])