# Gameplan: Issue #141 Asset Management - Variant B Implementation

**Date**: October 8, 2025  
**Issue**: #141 - Asset Management Concepts  
**Variant**: B - Content-Addressable Package System with Symlinks  
**Status**: 📋 **IMPLEMENTATION GAMEPLAN**

## Executive Summary

This gameplan outlines the implementation of **Variant B** from Issue #141, which provides a **Content-Addressable Package System with Symlinks** for managing images and file includes in markitect.

The implementation focuses on:
1. **Package-based document storage** (.mdpkg ZIP files)
2. **Symlink-based deduplication** with shared asset library
3. **CLI integration** with markitect commands
4. **Gradual rollout** with backward compatibility

## Architecture Overview

```
markitect_packages/
├── packages/                    # Generated .mdpkg files
│   ├── document_a.mdpkg
│   └── document_b.mdpkg
├── shared_assets/               # Deduplicated asset library
│   ├── images/
│   │   ├── content_hash_1.png
│   │   └── content_hash_2.jpg
│   └── registry.json            # Asset registry
└── workspace/                   # Working directory with symlinks
    ├── document_a/
    │   ├── index.md
    │   └── assets/              # Symlinks to shared_assets
    │       └── logo.png → ../../../shared_assets/images/content_hash_1.png
    └── document_b/
```

## Current Markitect Integration Points

Based on analysis of the existing codebase:

### Existing Modules
- **CLI Framework**: `/markitect/cli.py` - Main Click-based CLI with 247KB of commands
- **Module Structure**: Organized in packages (finance, issues, legacy, etc.)
- **Database Integration**: `/markitect/database.py` - SQLite-based storage
- **Configuration**: `/markitect/config_manager.py` - Centralized config management
- **Batch Processing**: `/markitect/batch_processor.py` - File processing pipeline

### Integration Strategy
- Follow existing patterns in `/markitect/finance/` and `/markitect/issues/`
- Use Click command groups for asset management commands
- Leverage existing `DatabaseManager` for metadata storage
- Integrate with `ConfigurationManager` for user settings

## Implementation Phases

### Phase 1: Core Asset Management Module (Week 1-2)

**Deliverables:**
1. **`/markitect/assets/` module structure**
2. **Asset registry and deduplication engine**
3. **Basic CLI commands**
4. **Unit tests**

**Components:**
```
markitect/assets/
├── __init__.py          # Module exports
├── registry.py          # AssetRegistry class
├── deduplicator.py      # AssetDeduplicator class
├── packager.py          # MarkdownPackager class
├── manager.py           # AssetManager class
├── cli.py               # Click command group
├── exceptions.py        # Asset-specific exceptions
└── constants.py         # Configuration constants
```

**Key Classes:**
- `AssetRegistry` - JSON-based asset metadata storage
- `AssetDeduplicator` - Symlink-based deduplication
- `MarkdownPackager` - .mdpkg creation/extraction
- `AssetManager` - High-level API coordinator

### Phase 2: CLI Integration (Week 3)

**Deliverables:**
1. **Full CLI command suite**
2. **Integration with existing markitect CLI**
3. **Configuration management**
4. **User documentation**

**CLI Commands:**
```bash
# Asset Management
markitect asset add [--name NAME]
markitect asset list [--document DOC] [--unused]
markitect asset dedupe [--dry-run]
markitect asset stats
markitect asset cleanup [--orphaned]

# Package Management
markitect package create
markitect package extract [--name NAME]
markitect package list
markitect package validate

# Workspace Management
markitect workspace init [--template TEMPLATE]
markitect workspace status
markitect workspace sync [--document DOC]
```

### Phase 3: Advanced Features (Week 4-5)

**Deliverables:**
1. **Batch processing integration**
2. **Database schema extensions**
3. **Performance optimizations**
4. **Integration tests**

**Features:**
- **Batch Import**: Process entire directories of assets
- **Auto-discovery**: Scan markdown files for asset references
- **Format Optimization**: Automatic image compression/conversion
- **Workspace Templates**: Pre-configured project structures
- **Asset Search**: Content-based asset discovery

### Phase 4: Production Readiness (Week 6)

**Deliverables:**
1. **Error handling and recovery**
2. **Configuration validation**
3. **Performance benchmarking**
4. **Documentation completion**

**Production Features:**
- **Rollback Support**: Undo asset operations
- **Conflict Resolution**: Handle symlink/file conflicts
- **Cross-platform Support**: Windows symlink alternatives
- **Migration Tools**: Import from existing asset workflows

## Technical Specifications

### Module Structure

**`markitect/assets/__init__.py`**
```python
"""Asset Management for Markitect - Issue #141 Variant B Implementation."""

from .registry import AssetRegistry
from .deduplicator import AssetDeduplicator
from .packager import MarkdownPackager
from .manager import AssetManager
from .exceptions import AssetError, DuplicationError, PackageError

__all__ = [
    'AssetRegistry',
    'AssetDeduplicator',
    'MarkdownPackager',
    'AssetManager',
    'AssetError',
    'DuplicationError',
    'PackageError'
]
```

**CLI Integration Pattern**
```python
# In markitect/cli.py
from .assets.cli import asset_commands

# asset_commands is already a Click group, so register it directly under the
# name "asset" instead of also declaring a second, empty `asset` group that
# the registration below would silently replace.
cli.add_command(asset_commands, name='asset')
```

### Database Schema Extensions

**Asset Metadata Table**
```sql
CREATE TABLE asset_metadata (
    content_hash TEXT PRIMARY KEY,
    original_name TEXT,
    file_size INTEGER,
    mime_type TEXT,
    stored_path TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_accessed TIMESTAMP,
    reference_count INTEGER DEFAULT 0
);

CREATE TABLE asset_references (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    content_hash TEXT,
    document_path TEXT,
    virtual_name TEXT,
    markdown_line INTEGER,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (content_hash) REFERENCES asset_metadata(content_hash)
);

CREATE INDEX idx_asset_refs_document ON asset_references(document_path);
CREATE INDEX idx_asset_refs_hash ON asset_references(content_hash);
```

### Configuration Schema

**Asset Management Settings**
```yaml
# markitect.yaml
asset_management:
  enabled: true
  workspace_path: "./markitect_workspace"
  shared_assets_path: "./markitect_workspace/shared_assets"
  packages_path: "./markitect_workspace/packages"

  # Deduplication settings
  auto_dedupe: true
  symlink_preferred: true
  fallback_to_copy: true  # Windows compatibility

  # Package settings
  compression_level: 6
  include_manifest: true
  validate_on_create: true

  # Performance settings
  cache_enabled: true
  batch_size: 100
  max_file_size_mb: 50
```

## CLI Command Specifications

### Asset Commands

**`markitect asset add`**
```bash
# Basic usage
markitect asset add logo.png ./project_a --name company_logo.png

# Options
--name NAME          # Virtual name in document (default: original filename)
--document PATH      # Target document directory (required)
--force              # Overwrite existing virtual name
--no-symlink         # Force file copy instead of symlink
```

**`markitect asset list`**
```bash
# List all assets
markitect asset list

# Filter by document
markitect asset list --document ./project_a

# Show unused assets
markitect asset list --unused

# Output formats
markitect asset list --format json
markitect asset list --format table
```

**`markitect asset dedupe`**
```bash
# Dry run (show what would be deduplicated)
markitect asset dedupe --dry-run

# Execute deduplication
markitect asset dedupe

# Force deduplication of all assets
markitect asset dedupe --force
```

### Package Commands

**`markitect package create`**
```bash
# Create package from document directory
markitect package create ./project_a project_a

# Options
--output PATH        # Output directory (default: workspace/packages)
--compression LEVEL  # ZIP compression level 0-9 (default: 6)
--exclude PATTERN    # Exclude files matching pattern
--include-sources    # Include source markdown files
```

**`markitect package extract`**
```bash
# Extract package to workspace
markitect package extract project_a.mdpkg

# Extract with custom name
markitect package extract project_a.mdpkg --name project_a_v2

# Options
--output PATH        # Output directory (default: workspace/documents)
--overwrite          # Overwrite existing directory
--no-dedupe          # Skip deduplication during extraction
```

## Testing Strategy

### Unit Tests

**Test Coverage Areas:**
- **Asset Registry**: JSON persistence, hash calculations, metadata management
- **Deduplicator**: Content hashing, symlink creation, fallback mechanisms
- **Packager**: ZIP creation/extraction, manifest handling, asset resolution
- **CLI Commands**: Command parsing, error handling, output formatting

**Test Structure:**
```
tests/
├── test_assets/
│   ├── test_registry.py
│   ├── test_deduplicator.py
│   ├── test_packager.py
│   └── test_cli.py
├── fixtures/
│   ├── test_images/
│   ├── test_documents/
│   └── test_packages/
└── integration/
    ├── test_full_workflow.py
    └── test_cross_platform.py
```

### Integration Tests

**Workflow Tests:**
1. **Complete Asset Lifecycle**: Add → Dedupe → Package → Extract
2. **Cross-Document Sharing**: Multiple docs referencing same assets
3. **Package Portability**: Create on one system, extract on another
4. **Error Recovery**: Broken symlinks, missing files, corrupted packages

### Performance Tests

**Benchmarking Scenarios:**
- **Large Asset Libraries**: 1000+ assets, multiple documents
- **Batch Processing**: Importing entire directories
- **Package Operations**: Creating/extracting large packages
- **Deduplication Efficiency**: Storage savings measurement

## Risk Mitigation

### Technical Risks

**Symlink Compatibility**
- **Risk**: Symlinks fail on Windows or restricted filesystems
- **Mitigation**: Automatic fallback to file copying
- **Detection**: Platform detection and permission testing

**Package Corruption**
- **Risk**: ZIP files become corrupted during transfer
- **Mitigation**: Built-in validation and checksum verification
- **Recovery**: Package repair tools and backup strategies

**Storage Scalability**
- **Risk**: Asset libraries become too large to manage efficiently
- **Mitigation**: Lazy loading, pagination, and cleanup tools
- **Monitoring**: Storage usage tracking and alerts

### User Experience Risks

**Learning Curve**
- **Risk**: Users find asset management complex
- **Mitigation**: Progressive disclosure, good defaults, clear documentation
- **Support**: Interactive tutorials and example workflows

**Data Loss**
- **Risk**: Assets accidentally deleted or corrupted
- **Mitigation**: Confirmation prompts, soft deletion, backup recommendations
- **Recovery**: Asset history tracking and restore capabilities

## Success Metrics

### Technical Metrics
- **Storage Efficiency**: 30%+ reduction in duplicate asset storage
- **Performance**: Asset operations complete in <100 ms for typical workloads
- **Reliability**: 99.9%+ success rate for package operations
- **Compatibility**: Works on Windows, macOS, and Linux

### User Adoption Metrics
- **CLI Usage**: Asset commands represent 10%+ of total markitect usage
- **Package Creation**: Users create 5+ packages per month on average
- **Error Rates**: <1% of asset operations result in user-visible errors
- **Documentation**: Asset management docs have 95%+ user satisfaction

## Implementation Timeline

**Week 1-2: Core Module**
- [ ] Asset registry implementation
- [ ] Deduplication engine with symlinks
- [ ] Basic package creation/extraction
- [ ] Unit test suite (80%+ coverage)

**Week 3: CLI Integration**
- [ ] Complete CLI command suite
- [ ] Integration with main markitect CLI
- [ ] Configuration management
- [ ] User documentation

**Week 4-5: Advanced Features**
- [ ] Batch processing capabilities
- [ ] Database integration
- [ ] Performance optimizations
- [ ] Integration test suite

**Week 6: Production Readiness**
- [ ] Error handling and recovery
- [ ] Cross-platform testing
- [ ] Performance benchmarking
- [ ] Release preparation

## Dependencies

### Internal Dependencies
- **markitect.database**: Metadata storage integration
- **markitect.config_manager**: Configuration management
- **markitect.cli**: Command registration and parsing
- **markitect.batch_processor**: Bulk operation support

### External Dependencies
- **Click**: CLI framework (existing dependency)
- **pathlib**: Path manipulation (standard library)
- **zipfile**: Package creation (standard library)
- **hashlib**: Content hashing (standard library)
- **json**: Metadata serialization (standard library)
- **os**: Symlink operations (standard library)

### Optional Dependencies
- **Pillow**: Image processing and optimization
- **Send2Trash**: Safe file deletion
- **watchdog**: File system monitoring

## Next Steps

1. **Review and Approval**: Get stakeholder sign-off on this gameplan
2. **Environment Setup**: Prepare development environment and test fixtures
3. **Phase 1 Kickoff**: Begin core module implementation
4. **Continuous Integration**: Set up automated testing pipeline
5. **Documentation**: Start user guide and API documentation

This gameplan provides a comprehensive roadmap for implementing Issue #141 Variant B, ensuring robust asset management capabilities while maintaining compatibility with existing markitect workflows.

---

**Status**: 📋 **Ready for Implementation - Awaiting Approval**
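
## Appendix: Deduplication Sketch

The deduplication flow this gameplan describes (hash the content, store it once in the shared library, expose it to each document through a symlink, fall back to a copy where symlinks are unavailable) could look roughly like the sketch below. It is illustrative only: the function names `content_hash`, `store_asset`, `link_asset`, and `add_asset` are hypothetical and are not the planned `AssetDeduplicator` API, and SHA-256 is an assumed choice, since the plan only says that `hashlib` is used for content hashing.

```python
import hashlib
import os
import shutil
from pathlib import Path
from typing import Optional


def content_hash(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks (assumed algorithm)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_asset(source: Path, shared_images: Path) -> Path:
    """Copy `source` into the shared library under a name derived from its content.

    Identical bytes map to the same file name, so a second copy of the same
    content is not stored again.
    """
    shared_images.mkdir(parents=True, exist_ok=True)
    stored = shared_images / f"{content_hash(source)}{source.suffix.lower()}"
    if not stored.exists():
        shutil.copy2(source, stored)
    return stored


def link_asset(stored: Path, assets_dir: Path, virtual_name: str) -> Path:
    """Expose a stored asset inside a document's assets directory.

    Tries a relative symlink first and falls back to a plain copy when the
    platform or filesystem refuses to create symlinks.
    """
    assets_dir.mkdir(parents=True, exist_ok=True)
    link = assets_dir / virtual_name
    if link.is_symlink() or link.exists():
        link.unlink()  # replace an earlier link or copy with the same name
    # Relative targets are resolved from the directory that holds the link.
    target = os.path.relpath(stored, start=assets_dir)
    try:
        os.symlink(target, link)
    except (OSError, NotImplementedError):
        shutil.copy2(stored, link)
    return link


def add_asset(source: Path, document_dir: Path, shared_images: Path,
              name: Optional[str] = None) -> Path:
    """Hypothetical high-level helper: store once, then link into the document."""
    stored = store_asset(source, shared_images)
    return link_asset(stored, document_dir / "assets", name or source.name)
```

Calling `add_asset` for two documents with byte-identical files leaves a single file in the shared library and one link (or copy) per document. Using a relative link target keeps the workspace relocatable as long as the shared library moves with it; a registry update and reference counting, which the plan assigns to `AssetRegistry`, are deliberately left out of this sketch.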