# Gameplan: Issue #141 Asset Management - Variant B Implementation

**Date**: October 8, 2025
**Issue**: #141 - Asset Management Concepts
**Variant**: B - Content-Addressable Package System with Symlinks
**Status**: 📋 **IMPLEMENTATION GAMEPLAN**

## Executive Summary

This gameplan outlines the implementation of **Variant B** from Issue #141, which provides a **Content-Addressable Package System with Symlinks** for managing images and file includes in markitect. The implementation focuses on:

1. **Package-based document storage** (.mdpkg ZIP files)
2. **Symlink-based deduplication** with shared asset library
3. **CLI integration** with markitect commands
4. **Gradual rollout** with backward compatibility

## Architecture Overview

```
markitect_packages/
├── packages/                    # Generated .mdpkg files
│   ├── document_a.mdpkg
│   └── document_b.mdpkg
├── shared_assets/               # Deduplicated asset library
│   ├── images/
│   │   ├── content_hash_1.png
│   │   └── content_hash_2.jpg
│   └── registry.json            # Asset registry
└── workspace/                   # Working directory with symlinks
    ├── document_a/
    │   ├── index.md
    │   └── assets/              # Symlinks to shared_assets
    │       └── logo.png → ../../../shared_assets/images/content_hash_1.png
    └── document_b/
```

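The content-addressable layout above can be illustrated with a minimal sketch. It assumes SHA-256 as the content hash (this plan does not name an algorithm), and the helper names `content_hash` and `store_asset` are hypothetical, not part of an existing markitect API.

```python
import hashlib
import shutil
from pathlib import Path


def content_hash(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_asset(source: Path, shared_images: Path) -> Path:
    """Copy `source` into the shared library under its content hash.

    Byte-identical files map to the same stored name, so storing a
    duplicate does not create a second copy.
    """
    shared_images.mkdir(parents=True, exist_ok=True)
    stored = shared_images / f"{content_hash(source)}{source.suffix.lower()}"
    if not stored.exists():
        shutil.copy2(source, stored)
    return stored
```

Because the stored filename is derived from the bytes alone, deduplication falls out of the naming scheme; the original filename would have to be kept separately, for example in `registry.json`.
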
## Current Markitect Integration Points

Based on analysis of the existing codebase:

### Existing Modules
- **CLI Framework**: `/markitect/cli.py` - Main Click-based CLI with 247KB of commands
- **Module Structure**: Organized in packages (finance, issues, legacy, etc.)
- **Database Integration**: `/markitect/database.py` - SQLite-based storage
- **Configuration**: `/markitect/config_manager.py` - Centralized config management
- **Batch Processing**: `/markitect/batch_processor.py` - File processing pipeline

### Integration Strategy
- Follow existing patterns in `/markitect/finance/` and `/markitect/issues/`
- Use Click command groups for asset management commands
- Leverage existing `DatabaseManager` for metadata storage
- Integrate with `ConfigurationManager` for user settings

## Implementation Phases

### Phase 1: Core Asset Management Module (Week 1-2)

**Deliverables:**
1. **`/markitect/assets/` module structure**
2. **Asset registry and deduplication engine**
3. **Basic CLI commands**
4. **Unit tests**

**Components:**
```
markitect/assets/
├── __init__.py          # Module exports
├── registry.py          # AssetRegistry class
├── deduplicator.py      # AssetDeduplicator class
├── packager.py          # MarkdownPackager class
├── manager.py           # AssetManager class
├── cli.py               # Click command group
├── exceptions.py        # Asset-specific exceptions
└── constants.py         # Configuration constants
```

**Key Classes:**
- `AssetRegistry` - JSON-based asset metadata storage
- `AssetDeduplicator` - Symlink-based deduplication
- `MarkdownPackager` - .mdpkg creation/extraction
- `AssetManager` - High-level API coordinator

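As a rough illustration of the JSON-based registry, the sketch below keeps one entry per content hash and tracks which documents reference it. The field names (`original_name`, `documents`) and the methods `register`, `unused`, and `save` are assumptions made for this example; the real `AssetRegistry` interface is still to be designed.

```python
import json
from pathlib import Path


class AssetRegistry:
    """Minimal JSON-backed registry keyed by content hash (illustrative only)."""

    def __init__(self, registry_path: Path) -> None:
        self.registry_path = registry_path
        self.entries: dict[str, dict] = {}
        if registry_path.exists():
            self.entries = json.loads(registry_path.read_text(encoding="utf-8"))

    def register(self, content_hash: str, original_name: str, document: str) -> dict:
        """Record that `document` references the asset with this hash."""
        entry = self.entries.setdefault(
            content_hash, {"original_name": original_name, "documents": []}
        )
        if document not in entry["documents"]:
            entry["documents"].append(document)
        return entry

    def unused(self) -> list[str]:
        """Return hashes that no document references."""
        return [h for h, e in self.entries.items() if not e["documents"]]

    def save(self) -> None:
        """Write to a temporary file first, then rename it over the registry."""
        tmp = self.registry_path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.entries, indent=2, sort_keys=True), encoding="utf-8")
        tmp.replace(self.registry_path)
```

Writing to a temporary file and renaming it keeps a crash during `save()` from leaving a half-written `registry.json` behind.
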
### Phase 2: CLI Integration (Week 3)

**Deliverables:**
1. **Full CLI command suite**
2. **Integration with existing markitect CLI**
3. **Configuration management**
4. **User documentation**

**CLI Commands:**
```bash
# Asset Management
markitect asset add <file> <document> [--name NAME]
markitect asset list [--document DOC] [--unused]
markitect asset dedupe [--dry-run]
markitect asset stats
markitect asset cleanup [--orphaned]

# Package Management
markitect package create <document-dir> <package-name>
markitect package extract <package-file> [--name NAME]
markitect package list
markitect package validate <package-file>

# Workspace Management
markitect workspace init [--template TEMPLATE]
markitect workspace status
markitect workspace sync [--document DOC]
```

### Phase 3: Advanced Features (Week 4-5)

**Deliverables:**
1. **Batch processing integration**
2. **Database schema extensions**
3. **Performance optimizations**
4. **Integration tests**

**Features:**
- **Batch Import**: Process entire directories of assets
- **Auto-discovery**: Scan markdown files for asset references
- **Format Optimization**: Automatic image compression/conversion
- **Workspace Templates**: Pre-configured project structures
- **Asset Search**: Content-based asset discovery

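Auto-discovery could start from something as small as the sketch below, which scans Markdown text for inline image references and reports the line each one appears on. The function name `find_asset_references` is hypothetical, and the regular expression is a simplification: it covers inline `![alt](target)` syntax only and does not skip fenced code blocks or handle reference-style images.

```python
import re

# Matches ![alt](target) and ![alt](target "title"); captures the target.
IMAGE_REF = re.compile(r"!\[[^\]]*\]\(\s*<?([^)\s>]+)>?[^)]*\)")
# A leading "scheme:" marks a remote URL or data URI rather than a local file.
HAS_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_asset_references(markdown_text: str) -> list[tuple[int, str]]:
    """Return (line_number, target) pairs for local image references."""
    found = []
    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        for match in IMAGE_REF.finditer(line):
            target = match.group(1)
            if HAS_SCHEME.match(target):
                continue  # remote resource, not a managed asset
            found.append((line_number, target))
    return found


sample = "# Title\n![Logo](assets/logo.png)\nSee ![remote](https://example.com/x.png).\n"
print(find_asset_references(sample))
```

The line number is kept because it is the kind of value a `markdown_line` column in a references table would store.
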
### Phase 4: Production Readiness (Week 6)

**Deliverables:**
1. **Error handling and recovery**
2. **Configuration validation**
3. **Performance benchmarking**
4. **Documentation completion**

**Production Features:**
- **Rollback Support**: Undo asset operations
- **Conflict Resolution**: Handle symlink/file conflicts
- **Cross-platform Support**: Windows symlink alternatives
- **Migration Tools**: Import from existing asset workflows

## Technical Specifications

### Module Structure

**`markitect/assets/__init__.py`**
```python
"""Asset Management for Markitect - Issue #141 Variant B Implementation."""

from .registry import AssetRegistry
from .deduplicator import AssetDeduplicator
from .packager import MarkdownPackager
from .manager import AssetManager
from .exceptions import AssetError, DuplicationError, PackageError

__all__ = [
    'AssetRegistry',
    'AssetDeduplicator',
    'MarkdownPackager',
    'AssetManager',
    'AssetError',
    'DuplicationError',
    'PackageError'
]
```

**CLI Integration Pattern**
```python
# In markitect/cli.py
from .assets.cli import asset_commands

# asset_commands is the Click group defined in markitect/assets/cli.py.
# Register it once on the main CLI under the name "asset"; declaring a
# second `asset` group here as well would register the name twice.
cli.add_command(asset_commands, name='asset')
```

### Database Schema Extensions

**Asset Metadata Table**
```sql
-- One row per unique asset, keyed by its content hash.
CREATE TABLE asset_metadata (
    content_hash TEXT PRIMARY KEY,
    original_name TEXT,
    file_size INTEGER,
    mime_type TEXT,
    stored_path TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_accessed TIMESTAMP,
    reference_count INTEGER DEFAULT 0
);

-- One row per use of an asset inside a document.
-- Note: SQLite enforces foreign keys only when PRAGMA foreign_keys = ON
-- has been set on the connection.
CREATE TABLE asset_references (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    content_hash TEXT,
    document_path TEXT,
    virtual_name TEXT,
    markdown_line INTEGER,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (content_hash) REFERENCES asset_metadata(content_hash)
);

CREATE INDEX idx_asset_refs_document ON asset_references(document_path);
CREATE INDEX idx_asset_refs_hash ON asset_references(content_hash);
```

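The sketch below shows one way the two tables might be used together from Python's standard `sqlite3` module. It uses a trimmed copy of the schema, and the helpers `add_reference` and `unused_assets` are hypothetical; how the existing `DatabaseManager` would expose this is not specified in this plan.

```python
import sqlite3

SCHEMA = """
CREATE TABLE asset_metadata (
    content_hash TEXT PRIMARY KEY,
    original_name TEXT,
    reference_count INTEGER DEFAULT 0
);
CREATE TABLE asset_references (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    content_hash TEXT,
    document_path TEXT,
    virtual_name TEXT,
    FOREIGN KEY (content_hash) REFERENCES asset_metadata(content_hash)
);
"""


def add_reference(conn, content_hash, original_name, document_path, virtual_name):
    """Insert a reference and bump reference_count in a single transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute(
            "INSERT OR IGNORE INTO asset_metadata (content_hash, original_name) VALUES (?, ?)",
            (content_hash, original_name),
        )
        conn.execute(
            "INSERT INTO asset_references (content_hash, document_path, virtual_name) VALUES (?, ?, ?)",
            (content_hash, document_path, virtual_name),
        )
        conn.execute(
            "UPDATE asset_metadata SET reference_count = reference_count + 1 WHERE content_hash = ?",
            (content_hash,),
        )


def unused_assets(conn):
    """Return hashes whose reference_count is zero (cleanup candidates)."""
    rows = conn.execute(
        "SELECT content_hash FROM asset_metadata WHERE reference_count = 0 ORDER BY content_hash"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript(SCHEMA)
add_reference(conn, "abc123", "logo.png", "./project_a", "company_logo.png")
add_reference(conn, "abc123", "logo.png", "./project_b", "logo.png")
```

Keeping the insert and the counter update in one transaction prevents `reference_count` from drifting away from the actual rows in `asset_references`.
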
### Configuration Schema

**Asset Management Settings**
```yaml
# markitect.yaml
asset_management:
  enabled: true
  workspace_path: "./markitect_workspace"
  shared_assets_path: "./markitect_workspace/shared_assets"
  packages_path: "./markitect_workspace/packages"

  # Deduplication settings
  auto_dedupe: true
  symlink_preferred: true
  fallback_to_copy: true  # Windows compatibility

  # Package settings
  compression_level: 6
  include_manifest: true
  validate_on_create: true

  # Performance settings
  cache_enabled: true
  batch_size: 100
  max_file_size_mb: 50
```

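Configuration validation for this section might look like the following sketch. It assumes the YAML has already been parsed into a plain mapping (parsing is left to whatever `ConfigurationManager` already does), and the names `AssetSettings` and `load_asset_settings` are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AssetSettings:
    """Typed view of the asset_management section; defaults mirror the YAML."""

    enabled: bool = True
    workspace_path: str = "./markitect_workspace"
    shared_assets_path: str = "./markitect_workspace/shared_assets"
    packages_path: str = "./markitect_workspace/packages"
    auto_dedupe: bool = True
    symlink_preferred: bool = True
    fallback_to_copy: bool = True
    compression_level: int = 6
    include_manifest: bool = True
    validate_on_create: bool = True
    cache_enabled: bool = True
    batch_size: int = 100
    max_file_size_mb: int = 50

    def __post_init__(self) -> None:
        # ZIP deflate levels run from 0 (store) to 9 (smallest output).
        if not 0 <= self.compression_level <= 9:
            raise ValueError("compression_level must be between 0 and 9")
        if self.batch_size < 1 or self.max_file_size_mb < 1:
            raise ValueError("batch_size and max_file_size_mb must be positive")


def load_asset_settings(raw: dict) -> AssetSettings:
    """Build settings from a parsed mapping, rejecting unknown keys."""
    known = {f.name for f in fields(AssetSettings)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown asset_management keys: {sorted(unknown)}")
    return AssetSettings(**raw)
```

Rejecting unknown keys catches typos such as `compresion_level` early instead of silently falling back to the default.
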
## CLI Command Specifications

### Asset Commands

**`markitect asset add`**
```bash
# Basic usage
markitect asset add logo.png ./project_a --name company_logo.png

# Arguments
<file>              # Asset file to add (required)
<document>          # Target document directory (required)

# Options
--name NAME         # Virtual name in document (default: original filename)
--force             # Overwrite existing virtual name
--no-symlink        # Force file copy instead of symlink
```

**`markitect asset list`**
```bash
# List all assets
markitect asset list

# Filter by document
markitect asset list --document ./project_a

# Show unused assets
markitect asset list --unused

# Output formats
markitect asset list --format json
markitect asset list --format table
```

**`markitect asset dedupe`**
```bash
# Dry run (show what would be deduplicated)
markitect asset dedupe --dry-run

# Execute deduplication
markitect asset dedupe

# Force deduplication of all assets
markitect asset dedupe --force
```

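A dry run essentially has to answer "which files share content, and how much space would merging them free?". The sketch below does that by grouping files on their SHA-256 digest. SHA-256 and the function names `plan_deduplication` and `reclaimable_bytes` are assumptions for illustration, and the sketch reads each file fully into memory, which a real implementation would likely avoid for large assets.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def plan_deduplication(root: Path) -> dict[str, list[Path]]:
    """Group regular files under `root` by SHA-256; keep groups with duplicates."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        # Skip directories and existing symlinks (treated as already deduplicated).
        if path.is_symlink() or not path.is_file():
            continue
        groups[hashlib.sha256(path.read_bytes()).hexdigest()].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


def reclaimable_bytes(plan: dict[str, list[Path]]) -> int:
    """Bytes saved if every duplicate group keeps a single stored copy."""
    return sum(paths[0].stat().st_size * (len(paths) - 1) for paths in plan.values())
```

Nothing is modified by either function, so the same plan can be printed for `--dry-run` and then handed to the code that actually replaces duplicates with links.
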
### Package Commands

**`markitect package create`**
```bash
# Create package from document directory
markitect package create ./project_a project_a

# Options
--output PATH         # Output directory (default: workspace/packages)
--compression LEVEL   # ZIP compression level 0-9 (default: 6)
--exclude PATTERN     # Exclude files matching pattern
--include-sources     # Include source markdown files
```

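Package creation can be sketched with the standard `zipfile` module. The function name `create_package`, the `manifest.json` member name, and the manifest layout (a `files` mapping of archive path to SHA-256 digest) are assumptions for this example; the plan only states that packages are ZIP files with a manifest. The sketch also assumes the package file is written outside the document directory.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def create_package(document_dir: Path, package_file: Path, compression_level: int = 6) -> dict:
    """Zip `document_dir` into `package_file` and embed a checksum manifest."""
    manifest = {"name": document_dir.name, "files": {}}
    with zipfile.ZipFile(
        package_file, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=compression_level
    ) as archive:
        for path in sorted(document_dir.rglob("*")):
            if not path.is_file():
                continue
            arcname = path.relative_to(document_dir).as_posix()
            # read_bytes() follows symlinks, so the package stores real content
            # rather than links into the shared asset library.
            data = path.read_bytes()
            manifest["files"][arcname] = hashlib.sha256(data).hexdigest()
            archive.writestr(arcname, data)
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return manifest
```

Resolving symlinks at packaging time is what makes the resulting `.mdpkg` portable: the archive no longer depends on the shared library of the machine that created it.
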
**`markitect package extract`**
```bash
# Extract package to workspace
markitect package extract project_a.mdpkg

# Extract with custom name
markitect package extract project_a.mdpkg --name project_a_v2

# Options
--output PATH   # Output directory (default: workspace/documents)
--overwrite     # Overwrite existing directory
--no-dedupe     # Skip deduplication during extraction
```

## Testing Strategy

### Unit Tests

**Test Coverage Areas:**
- **Asset Registry**: JSON persistence, hash calculations, metadata management
- **Deduplicator**: Content hashing, symlink creation, fallback mechanisms
- **Packager**: ZIP creation/extraction, manifest handling, asset resolution
- **CLI Commands**: Command parsing, error handling, output formatting

**Test Structure:**
```
tests/
├── test_assets/
│   ├── test_registry.py
│   ├── test_deduplicator.py
│   ├── test_packager.py
│   └── test_cli.py
├── fixtures/
│   ├── test_images/
│   ├── test_documents/
│   └── test_packages/
└── integration/
    ├── test_full_workflow.py
    └── test_cross_platform.py
```

### Integration Tests

**Workflow Tests:**
1. **Complete Asset Lifecycle**: Add → Dedupe → Package → Extract
2. **Cross-Document Sharing**: Multiple docs referencing same assets
3. **Package Portability**: Create on one system, extract on another
4. **Error Recovery**: Broken symlinks, missing files, corrupted packages

### Performance Tests

**Benchmarking Scenarios:**
- **Large Asset Libraries**: 1000+ assets, multiple documents
- **Batch Processing**: Importing entire directories
- **Package Operations**: Creating/extracting large packages
- **Deduplication Efficiency**: Storage savings measurement

## Risk Mitigation

### Technical Risks

**Symlink Compatibility**
- **Risk**: Symlinks fail on Windows or restricted filesystems
- **Mitigation**: Automatic fallback to file copying
- **Detection**: Platform detection and permission testing

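The fallback described above can be sketched as "try the symlink, copy if the platform refuses". The helper name `link_or_copy` is hypothetical, and the sketch assumes `link` is a file path, not a directory.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(stored: Path, link: Path) -> str:
    """Expose `stored` at `link`; return "symlink" or "copy" depending on what worked."""
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink() or link.exists():
        link.unlink()
    try:
        # A relative target keeps the whole tree relocatable. relpath raises
        # ValueError on Windows when the two paths are on different drives.
        target = os.path.relpath(stored, start=link.parent)
        os.symlink(target, link)
        return "symlink"
    except (OSError, NotImplementedError, ValueError):
        # Typical on Windows without the symlink privilege, or on filesystems
        # that do not support links: fall back to a plain copy.
        shutil.copy2(stored, link)
        return "copy"
```

Attempting the operation and handling the failure is usually more reliable than checking the platform name up front, because symlink support depends on privileges and filesystem type as well as the operating system. Returning the mode lets the caller record in the registry whether a given asset is linked or copied.
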
**Package Corruption**
- **Risk**: ZIP files become corrupted during transfer
- **Mitigation**: Built-in validation and checksum verification
- **Recovery**: Package repair tools and backup strategies

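Validation with checksum verification might be sketched as follows. It assumes each package carries a `manifest.json` whose `files` mapping holds SHA-256 digests; that manifest layout and the function name `validate_package` are assumptions for illustration, not a settled format.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def validate_package(package_file: Path) -> list[str]:
    """Return a list of problems; an empty list means the package passed."""
    if not zipfile.is_zipfile(package_file):
        return ["not a ZIP archive"]
    problems: list[str] = []
    with zipfile.ZipFile(package_file) as archive:
        bad_member = archive.testzip()  # CRC check of every member
        if bad_member is not None:
            problems.append(f"CRC mismatch: {bad_member}")
        if "manifest.json" not in archive.namelist():
            problems.append("manifest.json missing")
            return problems
        manifest = json.loads(archive.read("manifest.json"))
        for name, expected in manifest.get("files", {}).items():
            try:
                actual = hashlib.sha256(archive.read(name)).hexdigest()
            except (KeyError, zipfile.BadZipFile):
                problems.append(f"missing or unreadable file: {name}")
                continue
            if actual != expected:
                problems.append(f"checksum mismatch: {name}")
    return problems
```

The ZIP CRC check catches transfer damage, while the manifest digests additionally catch files that were swapped or removed after the package was built.
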
**Storage Scalability**
- **Risk**: Asset libraries become too large to manage efficiently
- **Mitigation**: Lazy loading, pagination, and cleanup tools
- **Monitoring**: Storage usage tracking and alerts

### User Experience Risks

**Learning Curve**
- **Risk**: Users find asset management complex
- **Mitigation**: Progressive disclosure, good defaults, clear documentation
- **Support**: Interactive tutorials and example workflows

**Data Loss**
- **Risk**: Assets accidentally deleted or corrupted
- **Mitigation**: Confirmation prompts, soft deletion, backup recommendations
- **Recovery**: Asset history tracking and restore capabilities

## Success Metrics

### Technical Metrics
- **Storage Efficiency**: 30%+ reduction in duplicate asset storage
- **Performance**: Asset operations complete in <100ms for typical workloads
- **Reliability**: 99.9%+ success rate for package operations
- **Compatibility**: Works on Windows, macOS, Linux

### User Adoption Metrics
- **CLI Usage**: Asset commands represent 10%+ of total markitect usage
- **Package Creation**: Users create 5+ packages per month on average
- **Error Rates**: <1% of asset operations result in user-visible errors
- **Documentation**: Asset management docs have 95%+ user satisfaction

## Implementation Timeline

**Week 1-2: Core Module**
- [ ] Asset registry implementation
- [ ] Deduplication engine with symlinks
- [ ] Basic package creation/extraction
- [ ] Unit test suite (80%+ coverage)

**Week 3: CLI Integration**
- [ ] Complete CLI command suite
- [ ] Integration with main markitect CLI
- [ ] Configuration management
- [ ] User documentation

**Week 4-5: Advanced Features**
- [ ] Batch processing capabilities
- [ ] Database integration
- [ ] Performance optimizations
- [ ] Integration test suite

**Week 6: Production Readiness**
- [ ] Error handling and recovery
- [ ] Cross-platform testing
- [ ] Performance benchmarking
- [ ] Release preparation

## Dependencies

### Internal Dependencies
- **markitect.database**: Metadata storage integration
- **markitect.config_manager**: Configuration management
- **markitect.cli**: Command registration and parsing
- **markitect.batch_processor**: Bulk operation support

### External Dependencies
- **Click**: CLI framework (existing dependency)
- **pathlib**: Path manipulation (standard library)
- **zipfile**: Package creation (standard library)
- **hashlib**: Content hashing (standard library)
- **json**: Metadata serialization (standard library)
- **os**: Symlink operations (standard library)

### Optional Dependencies
- **Pillow**: Image processing and optimization
- **Send2Trash**: Safe file deletion
- **watchdog**: File system monitoring

## Next Steps

1. **Review and Approval**: Get stakeholder sign-off on this gameplan
2. **Environment Setup**: Prepare development environment and test fixtures
3. **Phase 1 Kickoff**: Begin core module implementation
4. **Continuous Integration**: Set up automated testing pipeline
5. **Documentation**: Start user guide and API documentation

This gameplan provides a comprehensive roadmap for implementing Issue #141 Variant B, ensuring robust asset management capabilities while maintaining compatibility with existing markitect workflows.

---

**Status**: 📋 **Ready for Implementation - Awaiting Approval**