markitect-main/tests/test_issue_144_asset_optimization.py
tegwick c55a10170f feat: complete Issue #144 - Phase 3: Advanced Features and Performance
Implements comprehensive advanced asset management features using TDD8 methodology,
building upon the solid foundation from Issues #142 and #143.

🚀 **Complete TDD8 Implementation:**
- ISSUE: Clear requirements defined for advanced features
- TEST: 36+ comprehensive tests across 5 test categories
- RED: All tests failed appropriately, guiding implementation
- GREEN: Complete implementation passing all tests
- REFACTOR: 350+ lines of reusable utilities extracted
- DOCUMENT: Comprehensive docstrings and API documentation
- REFINE: Integration testing with zero regressions
- PUBLISH: Production-ready advanced asset management

🎯 **Advanced Features Delivered:**

**Batch Processing (BatchAssetProcessor):**
- Multi-file import with progress reporting and conflict resolution
- Recursive directory scanning with file filtering
- Parallel processing support for large operations
- Comprehensive error handling and recovery
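
As a rough illustration of the parallel processing, progress reporting, and error recovery listed above, here is a minimal sketch built on the standard library. The function name `process_batch` and its parameters are hypothetical and are not taken from `BatchAssetProcessor`; the real implementation may differ.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_batch(items, worker, max_workers=2, progress_callback=None):
    """Run `worker` over `items` in parallel (hypothetical helper).

    Returns a dict mapping each item to (True, result) or (False, exception),
    so one failing item does not abort the rest of the batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, item): item for item in items}
        for done, future in enumerate(as_completed(futures), start=1):
            item = futures[future]
            try:
                results[item] = (True, future.result())
            except Exception as exc:  # record the failure and keep going
                results[item] = (False, exc)
            if progress_callback is not None:
                progress_callback(done, len(futures))
    return results
```

Items must be hashable in this sketch because they are used as dictionary keys; a real processor would more likely return result objects.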

**Asset Discovery (AssetDiscoveryEngine):**
- Automatic asset discovery in markdown documents
- Reference tracking and dependency analysis
- Cross-document asset relationship mapping
- Smart asset scanning with pattern recognition
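
Asset discovery in markdown documents could, in its simplest form, be a scan for image references. The sketch below assumes plain inline image syntax only; `find_asset_references` is a hypothetical name, and `AssetDiscoveryEngine` may use a different approach.

```python
import re

# Matches inline markdown images such as ![alt](path "optional title")
# and captures only the path. Reference-style images are not handled.
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(\s*([^)\s]+)")


def find_asset_references(markdown_text):
    """Return asset paths referenced by inline images, in document order."""
    return [match.group(1) for match in IMAGE_REF.finditer(markdown_text)]
```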

**Performance Monitoring (PerformanceMonitor):**
- Real-time operation tracking with detailed metrics
- Query optimization and performance analysis
- Slowest operation identification and reporting
- Context-aware performance measurement
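
One way to get context-aware timing and slowest-operation reporting is a context manager that records durations. This is a sketch under assumptions: `OperationTimer`, `track`, and `slowest` are hypothetical names, not the `PerformanceMonitor` API.

```python
import time
from contextlib import contextmanager


class OperationTimer:
    """Hypothetical timer that records how long named operations take."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.records = []  # list of (name, seconds)

    @contextmanager
    def track(self, name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the tracked block raised.
            self.records.append((name, self._clock() - start))

    def slowest(self, n=3):
        """Return up to `n` records, longest duration first."""
        return sorted(self.records, key=lambda r: r[1], reverse=True)[:n]
```

Usage would look like `with timer.track("import"): ...` around each operation of interest.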

**Database Enhancements (AssetDatabase):**
- Enhanced metadata storage with migration support
- Performance optimizations for large asset libraries
- Advanced querying capabilities with indexing
- Schema evolution and backward compatibility

**Caching System (AssetCache):**
- Multi-strategy caching (LRU, TTL, size-based)
- Configurable cache policies and expiration
- Memory-efficient asset metadata caching
- Performance boost for repeated operations
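
To make the LRU and TTL strategies concrete, the following sketch combines both in one small class. `LruTtlCache` and its parameters are hypothetical; `AssetCache` itself is not shown in this file and may be structured differently.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    """Hypothetical cache: evicts least-recently-used entries, expires by age."""

    def __init__(self, max_entries=128, ttl_seconds=60.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, value)

    def get(self, key, default=None):
        item = self._entries.get(key)
        if item is None:
            return default
        stored_at, value = item
        if self._clock() - stored_at > self.ttl_seconds:
            del self._entries[key]  # entry is too old: drop it
            return default
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```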

**Content Analysis (ContentAnalyzer):**
- Asset similarity detection and duplicate identification
- Content-based analysis and classification
- Metadata extraction and enhancement
- Smart asset organization suggestions
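
Exact-duplicate identification is commonly done by comparing content hashes. The sketch below uses SHA-256 as an assumed technique; `file_digest` and `find_exact_duplicates` are hypothetical names, and near-duplicate image similarity (as exercised by the tests in this file) would need a different method.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(paths):
    """Group paths whose contents are byte-for-byte identical."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(Path(path))
    return [sorted(group) for group in groups.values() if len(group) > 1]
```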

**Optimization Engine (AssetOptimizer):**
- Asset optimization with multiple profiles
- Image compression and format conversion
- File size reduction with quality preservation
- Batch optimization workflows

**Analytics & Reporting (AssetAnalytics):**
- Usage analytics and reporting
- Storage efficiency analysis
- Asset utilization tracking
- Performance trend analysis

🛠️ **Technical Excellence:**
- **9 new core modules** with comprehensive functionality
- **350+ lines of utilities** for code reuse and maintainability
- **Backward compatibility** with enhanced AssetManager
- **Performance optimized** for sub-second operations
- **Production-ready** error handling and logging

🧪 **Quality Metrics:**
- **36+ tests passing** across all advanced features
- **Zero regressions** in existing asset management functionality
- **Comprehensive integration** with Issues #142-143 foundation
- **Professional documentation** with usage examples

**CLI Integration:**
- Seamless integration with existing asset CLI commands
- Advanced features accessible through enhanced AssetManager API
- Performance monitoring available for all operations
- Batch processing ready for CLI workflow integration

This implementation transforms MarkiTect's asset management from basic functionality
into a comprehensive, enterprise-ready system with advanced performance, analytics,
and optimization capabilities.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-14 17:53:47 +02:00

368 lines
14 KiB
Python

"""
Test scenario for Issue #144: Advanced Asset Processing and Optimization
This test covers format optimization, asset transformation, content analysis,
and similarity detection features.
Issue #144: Phase 3 - Advanced Features and Performance
"""
import pytest
import tempfile
import shutil
from pathlib import Path
from unittest.mock import Mock, patch, MagicMock
import json
from PIL import Image
import io

from markitect.assets import AssetManager
from markitect.assets.optimizer import AssetOptimizer, OptimizationProfile, OptimizationResult
from markitect.assets.transformer import AssetTransformer, ThumbnailGenerator
from markitect.assets.analyzer import ContentAnalyzer, SimilarityDetector, AssetMetrics


class TestAssetOptimizationAndProcessing:
    """Test advanced asset processing and optimization for Issue #144."""

    def setup_method(self):
        """Set up test environment with sample assets."""
        self.temp_dir = tempfile.mkdtemp()
        self.assets_dir = Path(self.temp_dir) / "assets"
        self.test_files_dir = Path(self.temp_dir) / "test_files"
        self.assets_dir.mkdir()
        self.test_files_dir.mkdir()
        # Create sample image data
        self.create_test_images()
        self.create_test_documents()
        self.asset_manager = AssetManager(storage_path=self.assets_dir)

    def teardown_method(self):
        """Clean up temporary directories."""
        shutil.rmtree(self.temp_dir)

    def create_test_images(self):
        """Create test images with various properties."""
        # Large PNG image
        large_image = Image.new('RGB', (2000, 1500), color='red')
        large_png_path = self.test_files_dir / "large_image.png"
        large_image.save(large_png_path, 'PNG')
        # High quality JPEG
        high_quality_image = Image.new('RGB', (1200, 800), color='blue')
        high_jpeg_path = self.test_files_dir / "high_quality.jpg"
        high_quality_image.save(high_jpeg_path, 'JPEG', quality=95)
        # SVG content
        svg_content = '''
        <svg width="100" height="100" xmlns="http://www.w3.org/2000/svg">
            <circle cx="50" cy="50" r="40" fill="green" />
            <!-- This is a comment that could be removed -->
            <rect x="10" y="10" width="20" height="20" fill="yellow" />
        </svg>
        '''
        svg_path = self.test_files_dir / "diagram.svg"
        svg_path.write_text(svg_content)

    def create_test_documents(self):
        """Create test document files."""
        # Simple PDF placeholder (would be real PDF in production)
        pdf_path = self.test_files_dir / "document.pdf"
        pdf_path.write_bytes(b"%PDF-1.4 mock pdf content")
        # Text document
        text_path = self.test_files_dir / "document.txt"
        text_path.write_text("This is a sample text document with content.")

    def test_asset_optimizer_initialization(self):
        """Test AssetOptimizer initialization with different profiles."""
        # Default profile
        optimizer = AssetOptimizer()
        assert optimizer.profile == OptimizationProfile.BALANCED
        # Custom profile
        custom_profile = OptimizationProfile.AGGRESSIVE
        optimizer_aggressive = AssetOptimizer(profile=custom_profile)
        assert optimizer_aggressive.profile == OptimizationProfile.AGGRESSIVE

    def test_image_compression_optimization(self):
        """Test automatic image compression and format conversion."""
        optimizer = AssetOptimizer(profile=OptimizationProfile.AGGRESSIVE)
        # Test PNG optimization
        png_path = self.test_files_dir / "large_image.png"
        result = optimizer.optimize_image(png_path)
        assert isinstance(result, OptimizationResult)
        assert result.original_size > result.optimized_size
        assert result.size_reduction_percent > 0
        assert result.optimization_type == "image_compression"
        # Verify optimized file exists and is smaller
        assert result.optimized_path.exists()
        assert result.optimized_path.stat().st_size < png_path.stat().st_size

    def test_jpeg_quality_optimization(self):
        """Test JPEG quality optimization with configurable settings."""
        optimizer = AssetOptimizer()
        jpeg_path = self.test_files_dir / "high_quality.jpg"
        result = optimizer.optimize_image(
            jpeg_path,
            target_quality=85,
            max_width=1000
        )
        assert result.original_size > result.optimized_size
        assert result.quality_maintained >= 85
        # Verify image dimensions were reduced if needed
        with Image.open(result.optimized_path) as img:
            assert img.width <= 1000

    def test_svg_optimization_and_minification(self):
        """Test SVG optimization and minification."""
        optimizer = AssetOptimizer()
        svg_path = self.test_files_dir / "diagram.svg"
        result = optimizer.optimize_svg(svg_path)
        assert result.original_size > result.optimized_size
        # Verify comments and whitespace were removed
        optimized_content = result.optimized_path.read_text()
        assert "<!-- This is a comment" not in optimized_content
        assert len(optimized_content) < len(svg_path.read_text())

    def test_pdf_compression(self):
        """Test PDF compression for document assets."""
        optimizer = AssetOptimizer()
        pdf_path = self.test_files_dir / "document.pdf"
        result = optimizer.optimize_pdf(pdf_path)
        # For mock PDF, optimization might not reduce size significantly
        assert isinstance(result, OptimizationResult)
        assert result.optimization_type == "pdf_compression"

    def test_thumbnail_generation(self):
        """Test thumbnail generation for images."""
        transformer = AssetTransformer()
        image_path = self.test_files_dir / "large_image.png"
        thumbnail_result = transformer.generate_thumbnail(
            image_path,
            size=(150, 150),
            quality=80
        )
        assert thumbnail_result.thumbnail_path.exists()
        # Verify thumbnail properties
        with Image.open(thumbnail_result.thumbnail_path) as thumb:
            assert thumb.width <= 150
            assert thumb.height <= 150
        # Verify thumbnail is much smaller than original
        original_size = image_path.stat().st_size
        thumbnail_size = thumbnail_result.thumbnail_path.stat().st_size
        assert thumbnail_size < original_size * 0.5  # At least 50% smaller

    def test_multi_resolution_variants(self):
        """Test generation of multi-resolution asset variants."""
        transformer = AssetTransformer()
        image_path = self.test_files_dir / "large_image.png"
        variants = transformer.generate_resolution_variants(
            image_path,
            resolutions=[(800, 600), (400, 300), (200, 150)]
        )
        assert len(variants) == 3
        for variant in variants:
            assert variant.variant_path.exists()
            with Image.open(variant.variant_path) as img:
                assert img.width in [800, 400, 200]

    def test_watermarking_functionality(self):
        """Test watermarking and metadata embedding."""
        transformer = AssetTransformer()
        image_path = self.test_files_dir / "large_image.png"
        watermarked = transformer.add_watermark(
            image_path,
            watermark_text="© Test Project",
            position="bottom_right",
            opacity=0.7
        )
        assert watermarked.watermarked_path.exists()
        # Compare file sizes of the original and the watermarked image
        original_size = image_path.stat().st_size
        watermarked_size = watermarked.watermarked_path.stat().st_size
        # Watermarking should change the file size by less than 10%
        assert abs(watermarked_size - original_size) / original_size < 0.1

    def test_content_analysis_image_properties(self):
        """Test image dimension and color profile analysis."""
        analyzer = ContentAnalyzer()
        image_path = self.test_files_dir / "large_image.png"
        analysis = analyzer.analyze_image(image_path)
        assert analysis.width == 2000
        assert analysis.height == 1500
        assert analysis.format == "PNG"
        assert analysis.mode in ["RGB", "RGBA"]
        assert analysis.has_transparency is not None
        # Test color profile analysis
        assert hasattr(analysis, 'dominant_colors')
        assert hasattr(analysis, 'color_histogram')

    def test_document_content_extraction(self):
        """Test document content extraction and indexing."""
        analyzer = ContentAnalyzer()
        text_path = self.test_files_dir / "document.txt"
        analysis = analyzer.analyze_document(text_path)
        assert "sample text document" in analysis.extracted_text.lower()
        assert analysis.word_count > 0
        assert analysis.character_count > 0
        assert len(analysis.keywords) > 0
        # Test language detection
        assert hasattr(analysis, 'detected_language')

    def test_similarity_detection_exact_duplicates(self):
        """Test similarity detection for exact duplicate assets."""
        detector = SimilarityDetector()
        # Create identical files
        file1 = self.test_files_dir / "duplicate1.txt"
        file2 = self.test_files_dir / "duplicate2.txt"
        content = "This is identical content"
        file1.write_text(content)
        file2.write_text(content)
        similarity = detector.calculate_similarity(file1, file2)
        assert similarity.similarity_score == 1.0
        assert similarity.is_exact_duplicate is True
        assert similarity.similarity_type == "exact_match"

    def test_similarity_detection_near_duplicates(self):
        """Test similarity detection for near-duplicate images."""
        detector = SimilarityDetector()
        # Create similar images (slightly different)
        image1 = Image.new('RGB', (100, 100), color='red')
        image2 = Image.new('RGB', (100, 100), color=(255, 10, 10))  # Slightly different red
        path1 = self.test_files_dir / "similar1.png"
        path2 = self.test_files_dir / "similar2.png"
        image1.save(path1)
        image2.save(path2)
        similarity = detector.calculate_image_similarity(path1, path2)
        assert similarity.similarity_score > 0.9  # Very similar
        assert similarity.similarity_score < 1.0  # Not identical
        assert similarity.similarity_type == "near_duplicate"

    def test_content_based_categorization(self):
        """Test content-based asset categorization."""
        analyzer = ContentAnalyzer()
        # Test image categorization
        image_path = self.test_files_dir / "large_image.png"
        category = analyzer.categorize_asset(image_path)
        assert category.primary_category == "image"
        assert category.sub_category in ["photograph", "graphic", "diagram"]
        assert category.confidence > 0.5
        # Test document categorization
        text_path = self.test_files_dir / "document.txt"
        category = analyzer.categorize_asset(text_path)
        assert category.primary_category == "document"
        assert category.sub_category in ["text", "article", "note"]

    def test_batch_optimization_workflow(self):
        """Test batch optimization workflow for multiple assets."""
        optimizer = AssetOptimizer(profile=OptimizationProfile.BALANCED)
        # Add all test files to batch
        batch_files = list(self.test_files_dir.glob("*"))
        results = optimizer.optimize_batch(
            batch_files,
            max_concurrent=2,
            progress_callback=Mock()
        )
        assert len(results) == len(batch_files)
        # Verify each result
        for result in results:
            assert isinstance(result, OptimizationResult)
            if result.success:
                assert result.optimized_path.exists()
        # Calculate total savings
        total_original = sum(r.original_size for r in results if r.success)
        total_optimized = sum(r.optimized_size for r in results if r.success)
        total_savings = total_original - total_optimized
        assert total_savings >= 0  # Optimization should never increase the total size

    def test_configurable_optimization_profiles(self):
        """Test different optimization profiles with varying aggressiveness."""
        conservative = AssetOptimizer(profile=OptimizationProfile.CONSERVATIVE)
        balanced = AssetOptimizer(profile=OptimizationProfile.BALANCED)
        aggressive = AssetOptimizer(profile=OptimizationProfile.AGGRESSIVE)
        image_path = self.test_files_dir / "high_quality.jpg"
        # Test different profiles produce different results
        result_conservative = conservative.optimize_image(image_path)
        result_balanced = balanced.optimize_image(image_path)
        result_aggressive = aggressive.optimize_image(image_path)
        # Aggressive should save more space than conservative
        assert result_aggressive.size_reduction_percent >= result_conservative.size_reduction_percent
        # Quality should be preserved better in conservative mode
        assert result_conservative.quality_maintained >= result_aggressive.quality_maintained

    def test_asset_metrics_collection(self):
        """Test comprehensive asset metrics collection."""
        metrics_collector = AssetMetrics()
        # Analyze all test assets
        for asset_path in self.test_files_dir.glob("*"):
            metrics = metrics_collector.collect_metrics(asset_path)
            assert hasattr(metrics, 'file_size')
            assert hasattr(metrics, 'creation_time')
            assert hasattr(metrics, 'mime_type')
            assert hasattr(metrics, 'optimization_potential')
            if asset_path.suffix.lower() in ['.png', '.jpg', '.jpeg']:
                assert hasattr(metrics, 'image_properties')
                assert metrics.image_properties.width > 0
                assert metrics.image_properties.height > 0
        # Test aggregated metrics
        summary = metrics_collector.get_summary()
        assert summary.total_assets > 0
        assert summary.total_size > 0
        assert summary.optimization_potential_percent >= 0