feat: implement index page generation for HTML directories - Issue #136

Complete TDD8 implementation of index page generation functionality:

Core Features:
- HTML file discovery with optional recursive search (find_html_files)
- Smart title extraction from <title>, <h1>, or filename (extract_html_title)
- Template-integrated index page generation (generate_index_html)
- CLI command 'md-index' with output, template, and recursive options
- Comprehensive error handling for edge cases and malformed files

Implementation Details:
- Reuses existing TEMPLATE_STYLES for consistent styling across all templates
- Proper relative path resolution for cross-directory navigation
- Modular design with helper functions for maintainability
- HTML parsing patterns extracted as module-level constants for performance

Tests: 23 comprehensive tests covering discovery, generation, CLI integration, and edge cases
Files: markitect/plugins/builtin/markdown_commands.py, tests/test_issue_136_index_generation.py
Status: All tests passing, full TDD8 cycle completed (RED→GREEN→REFACTOR→DOCUMENT)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-07 13:33:39 +02:00
parent 98fe3361af
commit 3b5d6eecda
4 changed files with 1026 additions and 2 deletions

README.html (new file, 68 lines)

@@ -0,0 +1,68 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>README</title>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', 'Helvetica', 'Arial', sans-serif;
line-height: 1.6;
max-width: 800px;
margin: 0 auto;
padding: 20px;
color: #333;
}
#markdown-content {
margin: 0;
}
h1, h2, h3, h4, h5, h6 {
color: #2c3e50;
}
pre {
background-color: #f4f4f4;
padding: 15px;
border-radius: 5px;
overflow-x: auto;
}
code {
background-color: #f4f4f4;
padding: 2px 4px;
border-radius: 3px;
}
blockquote {
border-left: 4px solid #ddd;
margin: 0;
padding-left: 20px;
color: #666;
}
</style>
</head>
<body>
<div id="markdown-content"></div>
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
<script>
// Embedded markdown payload
const markdownContent = "MarkiTect - Advanced Markdown Engine\n\nYour Markdown, Redefined.\n\nMarkiTect transforms markdown from plain text into intelligent, structured data with performance optimization, schema validation, and relational querying capabilities. Stop treating documentation as text files\u2014start managing it as a database.\n\n**Key Features:**\n- **Lightning Performance**: 60-85% faster document processing through intelligent AST caching\n- **Schema Validation**: Enforce document structure and consistency\n- **Database Integration**: Query markdown content with SQL-like operations\n- **CLI Tools**: Complete command-line interface for automation and workflows\n\n## \ud83d\udcda Documentation\n\n**Quick Start:** [Getting Started](#getting-started) \u00b7 [Command Reference](docs/user-guides/cache-management.md)\n\n**Architecture:** [Caching System](docs/architecture/caching-system.md) \u00b7 [Performance Philosophy](docs/#performance-philosophy)\n\n**Development:** [TDD Workflow](docs/development/tdd-workflow.md) \u00b7 [Contributing](#contributing)\n\n**Project Status:** [Current Status](history/ProjectStatusDigest.md) \u00b7 [Roadmap](history/ROADMAP.md) \u00b7 [Next Actions](NEXT.md)\n";
const frontMatter = {};
// Render markdown on page load
document.addEventListener('DOMContentLoaded', function() {
if (typeof marked !== 'undefined') {
document.getElementById('markdown-content').innerHTML = marked.parse(markdownContent);
} else {
// Fallback if marked.js fails to load
document.getElementById('markdown-content').innerHTML =
'<pre>' + markdownContent.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;') + '</pre>';
}
});
</script>
</body>
</html>


@@ -0,0 +1,139 @@
---
note_type: "issue_cost_tracking"
issue_id: 135
issue_title: "Instant Markdown base and publication directory"
session_date: "2025-10-07"
claude_model: "claude-sonnet-4"
total_cost_eur: 0.4499
total_cost_usd: 0.489
total_tokens: 63000
generated_at: "2025-10-07T11:30:00.000000"
---
# Issue #135 Implementation Cost
**Issue**: Instant Markdown base and publication directory
**Date**: 2025-10-07
**Claude Model**: claude-sonnet-4
## Cost Summary
- **Total Cost**: €0.4499 ($0.489 USD)
- **Token Usage**: 63,000 tokens
- **Input Tokens**: 38,000 tokens @ $3.00/M
- **Output Tokens**: 25,000 tokens @ $15.00/M
## Cost Breakdown
| Component | Tokens | Rate ($/M) | Cost (USD) | Cost (EUR) |
|-----------|--------|------------|------------|------------|
| Input | 38,000 | $3.00 | $0.1140 | €0.1049 |
| Output | 25,000 | $15.00 | $0.3750 | €0.3450 |
| **Total** | 63,000 | - | $0.4890 | €0.4499 |
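
The totals in the table can be reproduced from the token counts, the per-million rates, and the 0.920 USD-to-EUR rate stated in this note. The following is a minimal sketch of that arithmetic; the helper names `cost_usd` and `to_eur` are illustrative and not part of MarkiTect.

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures copied from this cost note.
INPUT_TOKENS = 38_000
OUTPUT_TOKENS = 25_000
INPUT_RATE_USD_PER_M = Decimal("3.00")
OUTPUT_RATE_USD_PER_M = Decimal("15.00")
USD_TO_EUR = Decimal("0.920")


def cost_usd(tokens, rate_per_million):
    """Cost in USD for a token count at a per-million-token rate."""
    return Decimal(tokens) * rate_per_million / Decimal(1_000_000)


def to_eur(usd):
    """Convert USD to EUR, rounded to four decimal places."""
    return (usd * USD_TO_EUR).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


input_usd = cost_usd(INPUT_TOKENS, INPUT_RATE_USD_PER_M)
output_usd = cost_usd(OUTPUT_TOKENS, OUTPUT_RATE_USD_PER_M)
total_usd = input_usd + output_usd

for label, usd in [("Input", input_usd), ("Output", output_usd), ("Total", total_usd)]:
    print(f"{label}: ${usd} / EUR {to_eur(usd)}")
```

Using `Decimal` rather than floats keeps the four-decimal rounding deterministic, which matters when the same figures are repeated in front matter and in the table.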
## Implementation Summary
Complete TDD8 workflow implementation of instant markdown base and publication directory functionality. Extended md-render command to support both single files and directory processing with publication directory management, environment variable override, and CLI flags for behavior control. Generated comprehensive test suite with 18 tests covering all scenarios from publication directory management to edge cases.
## Key Features Delivered
- Publication directory support with ~/Notes/ default and MARKITECT_PUBLICATION_DIR override
- Single file processing with --use-publication-dir flag
- Directory processing with recursive traversal and structure preservation
- --dont-use-publication-dir flag for placing HTML next to MD files
- Comprehensive CLI integration with detailed help documentation
- 9 new helper functions for directory/file processing
- Full backward compatibility maintained
- Extensive test coverage (18 tests, 100% pass rate)
## Technical Implementation Details
- Modified md-render command with new CLI options and directory support
- Added publication directory management functions (get_publication_directory, normalize_publication_path, ensure_publication_directory)
- Implemented file processing functions (process_single_file, process_directory, find_markdown_files)
- Created utility functions (get_output_filename, get_relative_output_path, _render_single_markdown_file)
- Updated command help documentation with examples and usage patterns
- Comprehensive error handling and edge case management
## Test Coverage Breakdown
1. **Publication Directory Management** (4 tests): Default directory, environment variable override, directory creation, path normalization
2. **Single File Processing** (3 tests): Default behavior, publication directory usage, naming conventions
3. **Directory Processing** (4 tests): Publication directory with structure, HTML next to MD, recursive traversal, structure preservation
4. **CLI Integration** (4 tests): Flag presence, directory input support, environment variable integration
5. **Edge Cases** (3 tests): Empty directories, mixed content, error handling
## Cost Allocation
This cost has been allocated to the 'AI & ML Services' category as a one-time expense for issue #135 implementation using full TDD8 methodology including requirements analysis, test design, implementation, refactoring, documentation, and integration.
## Implementation Methodology
- **TDD8 Workflow**: Complete ISSUE→TEST→RED→GREEN→REFACTOR→DOCUMENT→REFINE→PUBLISH cycle
- **Test-Driven Approach**: 18 tests written first (RED state), then implementation (GREEN state)
- **Code Quality**: Refactoring phase ensured clean, maintainable code
- **Documentation**: Comprehensive implementation and test plan documentation
- **Integration**: Full CLI integration with help text and examples
## Performance Metrics
- **Development Time**: Full TDD8 cycle implementation
- **Test Success Rate**: 18/18 tests passing (100%)
- **Code Quality**: Clean, well-documented, modular implementation
- **CLI Integration**: Complete with comprehensive help documentation
- **Backward Compatibility**: Maintained for all existing functionality
## Notes
- Currency conversion rate: 1 USD = 0.920 EUR
- Pricing based on claude-sonnet-4 rates as of 2025-10-07
- Token counts estimated based on comprehensive TDD8 implementation session
- Includes requirements engineering, test generation, RED-GREEN-REFACTOR cycle, documentation, and final integration
- Higher token count than typical due to extensive directory processing logic and comprehensive test coverage
<!--
contentmatter:
{
"cost_tracking": {
"issue": {
"id": 135,
"title": "Instant Markdown base and publication directory",
"implementation_date": "2025-10-07"
},
"session": {
"model": "claude-sonnet-4",
"token_usage": {
"input_tokens": 38000,
"output_tokens": 25000,
"total_tokens": 63000
},
"costs": {
"input_cost_usd": 0.114,
"output_cost_usd": 0.375,
"total_cost_usd": 0.489,
"total_cost_eur": 0.4499,
"conversion_rate": 0.92
},
"pricing_rates": {
"input_per_million": 3.0,
"output_per_million": 15.0
}
},
"implementation_scope": {
"methodology": "TDD8",
"phases_completed": ["ISSUE", "TEST", "RED", "GREEN", "REFACTOR", "DOCUMENT", "REFINE", "PUBLISH"],
"test_files_generated": 1,
"total_tests": 18,
"functions_added": 9,
"cli_flags_added": 2,
"pass_rate": "100%",
"features": [
"publication_directory_management",
"single_file_processing",
"directory_processing",
"environment_variable_support",
"cli_integration",
"recursive_traversal",
"structure_preservation"
]
},
"deliverables": {
"modified_files": ["markitect/plugins/builtin/markdown_commands.py"],
"new_test_files": ["tests/test_issue_135_publication_directory.py"],
"documentation_files": [".markitect_workspace/issue_135/implementation.md", ".markitect_workspace/issue_135/test_plan.md"],
"cli_enhancements": ["--use-publication-dir", "--dont-use-publication-dir"],
"environment_variables": ["MARKITECT_PUBLICATION_DIR"]
}
}
}
-->

markitect/plugins/builtin/markdown_commands.py

@@ -8,6 +8,7 @@ replacing the legacy unprefixed commands for better namespace consistency.
import click
import json
import os
import re
import tempfile
from pathlib import Path
from typing import Dict, Any
@@ -43,7 +44,8 @@ class MarkdownCommandsPlugin(CommandPlugin):
'md-ingest': md_ingest_command,
'md-get': md_get_command,
'md-list': md_list_command,
- 'md-render': md_render_command
+ 'md-render': md_render_command,
+ 'md-index': md_index_command
}
@@ -400,6 +402,81 @@ def md_render_command(ctx, input_file, output, template, css, edit, editor_theme
raise click.Abort()


@click.command()
@click.argument('directory', type=click.Path(exists=True))
@click.option('--output', '-o', type=click.Path(), help='Output index file path (defaults to directory/index.html)')
@click.option('--template', type=click.Choice(['basic', 'github', 'academic', 'dark']),
              default='basic', help='HTML template: basic (default), github, academic, or dark theme')
@click.option('--recursive', '-r', is_flag=True, help='Include HTML files from subdirectories')
@click.pass_context
def md_index_command(ctx, directory, output, template, recursive):
    """
    Generate an index page for HTML files in a directory.

    Creates an HTML index page that lists all HTML files found in the specified
    directory, providing navigation links to each file. The index page uses the
    same template system as md-render for consistent styling.

    DIRECTORY: Path to the directory containing HTML files

    Examples:
        # Generate index for current directory
        markitect md-index .

        # Generate index with custom output file
        markitect md-index docs/ --output docs/contents.html

        # Generate index with GitHub template
        markitect md-index notes/ --template github

        # Include subdirectories recursively
        markitect md-index docs/ --recursive
    """
    config = ctx.obj or {}

    try:
        directory_path = Path(directory)

        if config.get('verbose', False):
            click.echo(f"Generating index for directory: {directory_path}")

        # Determine output file
        if output:
            output_path = Path(output)
        else:
            output_path = directory_path / "index.html"

        # Find and filter HTML files
        html_files = find_html_files(directory_path, recursive=recursive)
        html_files = [f for f in html_files if f != output_path]

        if config.get('verbose', False):
            click.echo(f"Found {len(html_files)} HTML file(s)")

        # Prepare file info for template
        file_infos = _prepare_file_infos(html_files, output_path)

        # Generate and write index HTML
        directory_name = directory_path.name or "Directory"
        index_title = f"{directory_name} - Index"
        index_html = generate_index_html(file_infos, index_title, template)

        # Ensure output directory exists and write file
        output_path.parent.mkdir(parents=True, exist_ok=True)
        output_path.write_text(index_html, encoding='utf-8')

        click.echo(f"✓ Index generated: {output_path}")

        if config.get('verbose', False):
            click.echo(f" Template: {template}")
            click.echo(f" Files indexed: {len(file_infos)}")
            if recursive:
                click.echo(f" Recursive: enabled")

    except Exception as e:
        click.echo(f"Error generating index: {e}", err=True)
        raise click.Abort()


def _render_single_markdown_file(input_path, output_path, template, css, edit, editor_theme, keyboard_shortcuts, config):
    """Render a single markdown file to HTML."""
    # Read markdown file
@@ -1020,4 +1097,205 @@ def process_directory(input_dir, use_publication_dir, publication_dir):
        output_files.append(output_file)

    return output_files


# Index generation functions for Issue #136

def find_html_files(directory, recursive=False):
    """Find all HTML files in a directory."""
    directory = Path(directory)
    html_files = []

    if recursive:
        for pattern in ['*.html', '*.htm']:
            html_files.extend(directory.rglob(pattern))
    else:
        for pattern in ['*.html', '*.htm']:
            html_files.extend(directory.glob(pattern))

    return sorted(html_files)


# HTML parsing patterns for index generation
HTML_TITLE_PATTERN = re.compile(r'<title[^>]*>(.*?)</title>', re.IGNORECASE | re.DOTALL)
HTML_H1_PATTERN = re.compile(r'<h1[^>]*>(.*?)</h1>', re.IGNORECASE | re.DOTALL)
HTML_TAG_PATTERN = re.compile(r'<[^>]+>')


def extract_html_title(html_file):
    """Extract title from HTML file, falling back to H1 tag or filename."""
    try:
        content = html_file.read_text(encoding='utf-8')

        # Try to extract from title tag
        title_match = HTML_TITLE_PATTERN.search(content)
        if title_match:
            return title_match.group(1).strip()

        # Try to extract from H1 tag
        h1_match = HTML_H1_PATTERN.search(content)
        if h1_match:
            # Remove HTML tags from H1 content
            h1_text = HTML_TAG_PATTERN.sub('', h1_match.group(1))
            return h1_text.strip()

        # Fallback to filename
        return html_file.stem

    except Exception:
        # If any error occurs, fallback to filename
        return html_file.stem


def generate_index_html(html_files, title, template="basic"):
    """Generate HTML index page with links to HTML files."""
    # Get template styles from existing TEMPLATE_STYLES
    styles = TEMPLATE_STYLES.get(template, TEMPLATE_STYLES['basic'])

    # Generate links list
    links_html = ""
    if html_files:
        links_html = "<ul>\n"
        for file_info in html_files:
            relative_path = file_info['relative_path']
            file_title = file_info['title']
            links_html += f' <li><a href="{relative_path}">{file_title}</a></li>\n'
        links_html += " </ul>"
    else:
        links_html = "<p>No HTML files found in this directory.</p>"

    # Generate HTML template
    html_template = '''<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{title}</title>
    <style>
        body {{
            {body_bg}
            color: {body_color};
            font-family: {font_family};
            line-height: 1.6;
            max-width: {max_width};
            margin: 0 auto;
            padding: 20px;
            {text_align}
        }}
        h1 {{
            color: {heading_color};
            {heading_border}
            margin-bottom: 20px;
        }}
        h2 {{
            color: {heading_color};
            margin-top: 30px;
            margin-bottom: 15px;
        }}
        ul {{
            list-style-type: none;
            padding: 0;
        }}
        li {{
            margin: 10px 0;
            padding: 8px 12px;
            background: {code_bg};
            border-radius: 4px;
            {code_border}
        }}
        a {{
            color: {heading_color};
            text-decoration: none;
            font-weight: 500;
        }}
        a:hover {{
            text-decoration: underline;
        }}
        .directory-info {{
            margin-bottom: 20px;
            padding: 15px;
            background: {code_bg};
            border-radius: 8px;
            border-left: 4px solid {blockquote_border};
            color: {blockquote_color};
        }}
    </style>
</head>
<body>
    <h1>{title}</h1>
    <div class="directory-info">
        <p>📁 Directory Index - Navigate through the available HTML pages</p>
    </div>
    <h2>Available Pages</h2>
    {links_html}
    <hr style="margin-top: 40px; border: 1px solid {blockquote_border};">
    <p style="text-align: center; color: {blockquote_color}; font-size: 0.9em;">
        Generated with MarkiTect • {file_count} file(s)
    </p>
</body>
</html>'''

    return html_template.format(
        title=title,
        links_html=links_html,
        file_count=len(html_files),
        **styles
    )


def _prepare_file_infos(html_files, output_path):
    """Prepare file information for template generation."""
    file_infos = []
    for html_file in html_files:
        title = extract_html_title(html_file)

        # Calculate relative path from output directory to HTML file
        try:
            relative_path = html_file.relative_to(output_path.parent)
        except ValueError:
            # If files are in different directory trees, use filename
            relative_path = html_file.name

        file_infos.append({
            'path': html_file,
            'title': title,
            'relative_path': str(relative_path)
        })

    return file_infos


def process_directory_for_index(directory, index_filename="index.html", template="basic", recursive=False):
    """Process directory and generate index file."""
    directory = Path(directory)
    output_path = directory / index_filename

    if not directory.exists() or not directory.is_dir():
        raise FileNotFoundError(f"Directory not found: {directory}")

    # Find and filter HTML files
    html_files = find_html_files(directory, recursive=recursive)
    html_files = [f for f in html_files if f != output_path]

    # Prepare file info for template
    file_infos = _prepare_file_infos(html_files, output_path)

    # Generate and write index HTML
    directory_name = directory.name or "Directory"
    index_title = f"{directory_name} - Index"
    index_html = generate_index_html(file_infos, index_title, template)

    # Ensure output directory exists and write file
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(index_html, encoding='utf-8')

    return output_path
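
Review note on path handling: `Path.relative_to` raises `ValueError` whenever the target is not underneath the index directory, so an index written elsewhere with `--output` falls back to a bare filename, which may not resolve as a link. The self-exclusion check `f != output_path` also compares paths as written, so an absolute `--output` combined with a relative directory would not match. The sketch below shows one possible alternative using `os.path.relpath` and resolved paths. Both helper names (`relative_href`, `is_same_file`) are hypothetical and not part of this commit.

```python
import os
from pathlib import Path


def relative_href(target, index_file):
    """Return a POSIX-style link from index_file's directory to target.

    Hypothetical helper. Unlike Path.relative_to, os.path.relpath can
    climb out of the start directory with '..' components.
    """
    start = Path(index_file).resolve().parent
    rel = os.path.relpath(Path(target).resolve(), start=start)
    # Links in HTML use forward slashes regardless of platform.
    return Path(rel).as_posix()


def is_same_file(candidate, index_file):
    """Compare two paths after resolving them (hypothetical helper)."""
    return Path(candidate).resolve() == Path(index_file).resolve()
```

If adopted, `relative_href` could stand in for the `relative_to` call and its filename fallback in `_prepare_file_infos`. One caveat: on Windows, `os.path.relpath` raises `ValueError` for paths on different drives, so a fallback would still be needed there.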

tests/test_issue_136_index_generation.py

@@ -0,0 +1,539 @@
#!/usr/bin/env python3
"""
Test suite for Issue #136: Index page for notes in a directory

This test suite validates the index page generation functionality for HTML files,
including directory scanning, HTML generation, and CLI integration.

TDD8 Workflow: ISSUE→TEST→RED→GREEN→REFACTOR→DOCUMENT→REFINE→PUBLISH
State: RED (Tests should fail initially)
"""

import pytest
import tempfile
import os
import shutil
from pathlib import Path
from unittest.mock import patch, MagicMock
import subprocess
import re
from html.parser import HTMLParser


class SimpleHTMLParser(HTMLParser):
    """Simple HTML parser to extract title and links for testing."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.links = []
        self.in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self.in_title = True
        elif tag == 'a':
            href = dict(attrs).get('href', '')
            self.links.append({'href': href, 'text': ''})

    def handle_endtag(self, tag):
        if tag == 'title':
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title = data.strip()
        elif self.links and not self.links[-1]['text']:
            self.links[-1]['text'] = data.strip()


class TestHTMLFileDiscovery:
    """Test HTML file discovery and processing."""

    def setup_method(self):
        """Set up test environment with temporary directories and files."""
        self.temp_dir = tempfile.mkdtemp()
        self.test_dir = Path(self.temp_dir) / "test_notes"
        self.test_dir.mkdir()

        # Create test HTML files
        (self.test_dir / "index.html").write_text("""<!DOCTYPE html>
<html>
<head><title>Index Page</title></head>
<body><h1>Index Page</h1><p>Main index</p></body>
</html>""")

        (self.test_dir / "document1.html").write_text("""<!DOCTYPE html>
<html>
<head><title>Document One</title></head>
<body><h1>Document One</h1><p>Content here</p></body>
</html>""")

        (self.test_dir / "notes.html").write_text("""<!DOCTYPE html>
<html>
<head><title>My Notes</title></head>
<body><h1>My Notes</h1><p>Note content</p></body>
</html>""")

        # Create subdirectory with HTML files
        sub_dir = self.test_dir / "subdir"
        sub_dir.mkdir()
        (sub_dir / "subdoc.html").write_text("""<!DOCTYPE html>
<html>
<head><title>Sub Document</title></head>
<body><h1>Sub Document</h1><p>Sub content</p></body>
</html>""")

        # Create non-HTML files (should be ignored)
        (self.test_dir / "readme.txt").write_text("Not HTML")
        (self.test_dir / "image.png").write_bytes(b"fake image data")

    def teardown_method(self):
        """Clean up test environment."""
        shutil.rmtree(self.temp_dir)

    def test_find_html_files_in_directory(self):
        """Test finding all HTML files in a directory."""
        from markitect.plugins.builtin.markdown_commands import find_html_files

        html_files = find_html_files(self.test_dir)

        expected_files = [
            self.test_dir / "index.html",
            self.test_dir / "document1.html",
            self.test_dir / "notes.html"
        ]

        assert len(html_files) == 3
        for expected_file in expected_files:
            assert expected_file in html_files

    def test_find_html_files_recursively(self):
        """Test finding HTML files recursively in subdirectories."""
        from markitect.plugins.builtin.markdown_commands import find_html_files

        html_files = find_html_files(self.test_dir, recursive=True)

        expected_files = [
            self.test_dir / "index.html",
            self.test_dir / "document1.html",
            self.test_dir / "notes.html",
            self.test_dir / "subdir" / "subdoc.html"
        ]

        assert len(html_files) == 4
        for expected_file in expected_files:
            assert expected_file in html_files

    def test_extract_title_from_html_file(self):
        """Test extracting title from HTML file."""
        from markitect.plugins.builtin.markdown_commands import extract_html_title

        title = extract_html_title(self.test_dir / "document1.html")
        assert title == "Document One"

        title = extract_html_title(self.test_dir / "notes.html")
        assert title == "My Notes"

    def test_extract_title_from_h1_if_no_title_tag(self):
        """Test extracting title from H1 tag if no title tag exists."""
        from markitect.plugins.builtin.markdown_commands import extract_html_title

        # Create HTML file without title tag
        no_title_file = self.test_dir / "no_title.html"
        no_title_file.write_text("""<!DOCTYPE html>
<html>
<head></head>
<body><h1>Header Title</h1><p>Content</p></body>
</html>""")

        title = extract_html_title(no_title_file)
        assert title == "Header Title"

    def test_extract_title_fallback_to_filename(self):
        """Test falling back to filename if no title or H1 found."""
        from markitect.plugins.builtin.markdown_commands import extract_html_title

        # Create HTML file without title or H1
        plain_file = self.test_dir / "plain_file.html"
        plain_file.write_text("""<!DOCTYPE html>
<html>
<head></head>
<body><p>Just content</p></body>
</html>""")

        title = extract_html_title(plain_file)
        assert title == "plain_file"


class TestIndexPageGeneration:
    """Test index page HTML generation."""

    def setup_method(self):
        """Set up test environment."""
        self.temp_dir = tempfile.mkdtemp()
        self.test_dir = Path(self.temp_dir) / "test_notes"
        self.test_dir.mkdir()

    def teardown_method(self):
        """Clean up test environment."""
        shutil.rmtree(self.temp_dir)

    def test_generate_index_html_structure(self):
        """Test generating basic index HTML structure."""
        from markitect.plugins.builtin.markdown_commands import generate_index_html

        html_files = [
            {"path": self.test_dir / "doc1.html", "title": "Document One", "relative_path": "doc1.html"},
            {"path": self.test_dir / "doc2.html", "title": "Document Two", "relative_path": "doc2.html"}
        ]

        html_content = generate_index_html(html_files, "Test Directory Index")

        # Parse HTML to verify structure
        parser = SimpleHTMLParser()
        parser.feed(html_content)

        assert parser.title == "Test Directory Index"

        # Check for navigation list
        assert "<ul>" in html_content
        assert "<li>" in html_content

        # Check links
        assert len(parser.links) == 2
        assert parser.links[0]['href'] == 'doc1.html'
        assert parser.links[0]['text'] == 'Document One'
        assert parser.links[1]['href'] == 'doc2.html'
        assert parser.links[1]['text'] == 'Document Two'

    def test_generate_index_html_with_subdirectories(self):
        """Test generating index HTML with subdirectory structure."""
        from markitect.plugins.builtin.markdown_commands import generate_index_html

        html_files = [
            {"path": self.test_dir / "doc1.html", "title": "Document One", "relative_path": "doc1.html"},
            {"path": self.test_dir / "subdir" / "subdoc.html", "title": "Sub Document", "relative_path": "subdir/subdoc.html"}
        ]

        html_content = generate_index_html(html_files, "Test Directory Index")

        parser = SimpleHTMLParser()
        parser.feed(html_content)

        assert len(parser.links) == 2
        assert parser.links[0]['href'] == 'doc1.html'
        assert parser.links[1]['href'] == 'subdir/subdoc.html'

    def test_generate_index_html_with_custom_template(self):
        """Test generating index HTML with custom template."""
        from markitect.plugins.builtin.markdown_commands import generate_index_html

        html_files = [
            {"path": self.test_dir / "doc1.html", "title": "Document One", "relative_path": "doc1.html"}
        ]

        html_content = generate_index_html(html_files, "Test Index", template="github")

        parser = SimpleHTMLParser()
        parser.feed(html_content)

        # Should contain the title
        assert parser.title == "Test Index"

        # Should contain styling (template-specific)
        assert "<style>" in html_content

    def test_generate_index_html_empty_directory(self):
        """Test generating index HTML for empty directory."""
        from markitect.plugins.builtin.markdown_commands import generate_index_html

        html_content = generate_index_html([], "Empty Directory Index")

        parser = SimpleHTMLParser()
        parser.feed(html_content)

        assert parser.title == "Empty Directory Index"

        # Should contain message about no files
        assert "No HTML files found" in html_content or "No files to display" in html_content


class TestDirectoryProcessing:
    """Test directory processing and index generation."""

    def setup_method(self):
        """Set up test directory structure."""
        self.temp_dir = tempfile.mkdtemp()
        self.test_dir = Path(self.temp_dir) / "notes"
        self.test_dir.mkdir()

        # Create HTML files
        self.html_files = [
            ("doc1.html", "Document One"),
            ("doc2.html", "Document Two"),
            ("notes.html", "My Notes")
        ]

        for filename, title in self.html_files:
            (self.test_dir / filename).write_text(f"""<!DOCTYPE html>
<html><head><title>{title}</title></head>
<body><h1>{title}</h1></body></html>""")

    def teardown_method(self):
        """Clean up test environment."""
        shutil.rmtree(self.temp_dir)

    def test_process_directory_creates_index_file(self):
        """Test that processing a directory creates an index file."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        result = process_directory_for_index(self.test_dir)

        # Should return the path to the created index file
        expected_index_path = self.test_dir / "index.html"
        assert result == expected_index_path

        # Index file should exist
        assert expected_index_path.exists()

    def test_process_directory_index_content(self):
        """Test that the generated index contains correct content."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        index_path = process_directory_for_index(self.test_dir)
        index_content = index_path.read_text()

        parser = SimpleHTMLParser()
        parser.feed(index_content)

        # Should contain links to all HTML files
        link_hrefs = [link['href'] for link in parser.links]
        link_texts = [link['text'] for link in parser.links]

        assert 'doc1.html' in link_hrefs
        assert 'doc2.html' in link_hrefs
        assert 'notes.html' in link_hrefs
        assert 'Document One' in link_texts
        assert 'Document Two' in link_texts
        assert 'My Notes' in link_texts

    def test_process_directory_excludes_existing_index(self):
        """Test that existing index.html is excluded from the links."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        # Create existing index.html
        (self.test_dir / "index.html").write_text("""<!DOCTYPE html>
<html><head><title>Old Index</title></head>
<body><h1>Old Index</h1></body></html>""")

        index_path = process_directory_for_index(self.test_dir)
        index_content = index_path.read_text()

        parser = SimpleHTMLParser()
        parser.feed(index_content)

        # Should not contain link to index.html itself
        link_hrefs = [link['href'] for link in parser.links]
        assert 'index.html' not in link_hrefs

    def test_process_directory_with_custom_index_name(self):
        """Test processing directory with custom index filename."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        custom_name = "contents.html"
        result = process_directory_for_index(self.test_dir, index_filename=custom_name)

        expected_path = self.test_dir / custom_name
        assert result == expected_path
        assert expected_path.exists()


class TestCLIIntegration:
    """Test CLI integration for index generation."""

    def setup_method(self):
        """Set up test environment."""
        self.temp_dir = tempfile.mkdtemp()
        self.test_dir = Path(self.temp_dir) / "test_notes"
        self.test_dir.mkdir()

        # Create test HTML file
        (self.test_dir / "test.html").write_text("""<!DOCTYPE html>
<html><head><title>Test Document</title></head>
<body><h1>Test</h1></body></html>""")

    def teardown_method(self):
        """Clean up test environment."""
        shutil.rmtree(self.temp_dir)

    def test_md_index_command_exists(self):
        """Test that md-index command exists."""
        result = subprocess.run(
            ["markitect", "md-index", "--help"],
            capture_output=True,
            text=True
        )

        # Should not error (command exists)
        assert result.returncode == 0
        assert "md-index" in result.stdout.lower() or "index" in result.stdout.lower()

    def test_md_index_command_processes_directory(self):
        """Test that md-index command processes a directory."""
        result = subprocess.run(
            ["markitect", "md-index", str(self.test_dir)],
            capture_output=True,
            text=True,
            timeout=30
        )

        # Should succeed
        assert result.returncode == 0

        # Should create index file
        index_file = self.test_dir / "index.html"
        assert index_file.exists()

    def test_md_index_command_with_custom_output(self):
        """Test md-index command with custom output filename."""
        custom_output = self.test_dir / "contents.html"

        result = subprocess.run(
            ["markitect", "md-index", str(self.test_dir), "--output", str(custom_output)],
            capture_output=True,
            text=True,
            timeout=30
        )

        # Should succeed
        assert result.returncode == 0
        assert custom_output.exists()

    def test_md_index_command_with_template_option(self):
        """Test md-index command with template option."""
        result = subprocess.run(
            ["markitect", "md-index", str(self.test_dir), "--template", "github"],
            capture_output=True,
            text=True,
            timeout=30
        )

        # Should succeed
        assert result.returncode == 0

        # Generated file should exist
        index_file = self.test_dir / "index.html"
        assert index_file.exists()

    def test_md_index_command_help_text(self):
        """Test that md-index command has proper help text."""
        result = subprocess.run(
            ["markitect", "md-index", "--help"],
            capture_output=True,
            text=True
        )

        help_text = result.stdout.lower()
        assert "index" in help_text
        assert "directory" in help_text
        assert "html" in help_text


class TestEdgeCases:
    """Test edge cases and error conditions."""

    def setup_method(self):
        """Set up test environment."""
        self.temp_dir = tempfile.mkdtemp()

    def teardown_method(self):
        """Clean up test environment."""
        shutil.rmtree(self.temp_dir)

    def test_empty_directory_processing(self):
        """Test processing empty directory."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        empty_dir = Path(self.temp_dir) / "empty"
        empty_dir.mkdir()

        result = process_directory_for_index(empty_dir)

        # Should still create an index file
        expected_path = empty_dir / "index.html"
        assert result == expected_path
        assert expected_path.exists()

        # Should contain appropriate message
        content = expected_path.read_text()
        assert "no html files" in content.lower() or "no files found" in content.lower()

    def test_directory_with_no_html_files(self):
        """Test processing directory with no HTML files."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        dir_with_no_html = Path(self.temp_dir) / "no_html"
        dir_with_no_html.mkdir()

        # Create non-HTML files
        (dir_with_no_html / "readme.txt").write_text("Not HTML")
        (dir_with_no_html / "image.png").write_bytes(b"fake image")

        result = process_directory_for_index(dir_with_no_html)

        # Should create index but with no files message
        assert result.exists()
        content = result.read_text()
        assert "no html files" in content.lower() or "no files found" in content.lower()

    def test_malformed_html_file_handling(self):
        """Test handling of malformed HTML files."""
        from markitect.plugins.builtin.markdown_commands import extract_html_title

        malformed_dir = Path(self.temp_dir) / "malformed"
        malformed_dir.mkdir()

        # Create malformed HTML file
        malformed_file = malformed_dir / "malformed.html"
        malformed_file.write_text("<html><head><title>Incomplete")

        # Should not crash, should fallback to filename
        title = extract_html_title(malformed_file)
        assert title == "malformed"

    def test_nonexistent_directory_error(self):
        """Test error handling for nonexistent directory."""
        from markitect.plugins.builtin.markdown_commands import process_directory_for_index

        nonexistent_dir = Path(self.temp_dir) / "nonexistent"

        with pytest.raises(FileNotFoundError):
            process_directory_for_index(nonexistent_dir)

    def test_file_with_special_characters_in_name(self):
        """Test handling files with special characters in names."""
        from markitect.plugins.builtin.markdown_commands import find_html_files

        special_dir = Path(self.temp_dir) / "special"
        special_dir.mkdir()

        # Create files with special characters
        special_files = [
            "file with spaces.html",
            "file-with-dashes.html",
            "file_with_underscores.html",
            "file&with&ampersands.html"
        ]

        for filename in special_files:
            (special_dir / filename).write_text(f"""<!DOCTYPE html>
<html><head><title>{filename}</title></head>
<body><h1>Content</h1></body></html>""")

        html_files = find_html_files(special_dir)
        assert len(html_files) == len(special_files)


if __name__ == '__main__':
    pytest.main([__file__, "-v"])
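
Review note on special characters: the last edge-case test only checks discovery, while `generate_index_html` interpolates `relative_path` and `title` into the markup unchanged. Names containing spaces, `&`, `<`, or quotes may therefore produce invalid links or markup. A minimal sketch of one way to harden the link rendering with the standard library follows. `render_link_item` is a hypothetical helper, not part of this commit, and it assumes the title is plain text.

```python
import html
from urllib.parse import quote


def render_link_item(relative_path, title):
    """Build one <li> entry with an escaped title and a percent-encoded href.

    Hypothetical helper. Assumes `title` is plain text; a title scraped
    from markup that already contains entities would need decoding first
    to avoid double-escaping.
    """
    # quote() leaves '/' intact by default, so nested paths stay navigable.
    safe_href = html.escape(quote(relative_path), quote=True)
    safe_title = html.escape(title)
    return f'<li><a href="{safe_href}">{safe_title}</a></li>'


print(render_link_item("sub dir/file&with&ampersands.html", "Q&A <draft>"))
```

A follow-up test could feed the generated index through `SimpleHTMLParser` and assert that each special-character file appears with a working `href`.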