- Updated GAMEPLAN.md to reflect decomposed scope after creating separate issues
- Issue #38 now focuses specifically on content-stats and content-get commands
- Phase 1 (db-data command restructuring) marked as completed
- Related issues clearly referenced: #41 (frontmatter), #42 (contentmatter), #43 (tailmatter)
- Updated timeline from 2-3 weeks to 3-5 days for focused scope
- Refined success metrics and technical architecture for content commands only

Changes made:
- Objective updated to reflect content commands focus
- Implementation phases restructured with Phase 1 completed
- Test organization simplified to current focus
- Technical architecture focused on content_processor.py module
- Success metrics updated for 2 commands instead of 15+
- Development order reflects completed foundation work

Related to Issue #38: Access metadata, frontmatter, content separately in CLI

Following user request: "Create separate new issues for frontmatter, contentmatter, tailmatter support respectively"

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
GAMEPLAN for Issue #38: Content Commands - Access Content Stats and Raw Content Separately
🎯 OBJECTIVE (Updated Scope)
Implement dedicated CLI commands for content analysis and raw content access, completing the foundation of granular markdown component access. This issue now focuses specifically on content commands after decomposing the original broader scope into separate issues:
- Issue #41: Frontmatter Commands
- Issue #42: Contentmatter Commands
- Issue #43: Tailmatter Commands
📋 REQUIREMENTS ANALYSIS
Current State Assessment:
- ✅ `markitect metadata` command renamed to `db-data` (Phase 1 completed)
- ✅ Backward compatibility maintained with deprecation warnings
- 🎯 Current Focus: Implement content analysis and raw content access commands
Target Architecture (Content Commands Only):
Based on MarkdownMatters.md specification, this issue focuses on:
- Content Statistics: Word count, line count, heading analysis, link analysis
- Raw Content Access: Extract clean markdown content without frontmatter/tailmatter
- Content Processing: Strip matter blocks, analyze structure, provide metrics
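As a rough illustration of raw content access, the sketch below strips a leading frontmatter block. It assumes the common YAML convention of a block opened by `---` on the first line and closed by the next `---` or `...` line. The authoritative matter formats live in MarkdownMatters.md, which is not reproduced here, so tailmatter handling is deliberately left out. The function name `strip_yaml_frontmatter` is hypothetical.

```python
def strip_yaml_frontmatter(markdown: str) -> str:
    """Return markdown without a leading '---' delimited block (assumed format)."""
    lines = markdown.splitlines(keepends=True)
    # Frontmatter only counts if it starts on the very first line.
    if not lines or lines[0].strip() != "---":
        return markdown
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() in ("---", "..."):
            # Drop everything up to and including the closing delimiter.
            return "".join(lines[index + 1:])
    # No closing delimiter found: treat the document as having no frontmatter.
    return markdown


sample = "---\ntitle: Demo\n---\n# Heading\n\nBody text.\n"
print(strip_yaml_frontmatter(sample), end="")
```

Returning the input unchanged when the block is unterminated is one possible reading of "graceful degradation"; raising an error would be an equally defensible choice.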
🚀 IMPLEMENTATION PHASES
Phase 1: Command Restructuring (Foundation) ✅ COMPLETED
- Rename `metadata` command to `db-data`
- Update all references in codebase and tests
- Maintain backward compatibility with deprecation warnings
- Update CLI help and documentation
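The rename with a deprecated alias could look roughly like the sketch below. The plan does not say which CLI framework markitect uses, so this uses the standard-library `argparse` purely as a stand-in. The helper names `build_parser` and `run` are hypothetical.

```python
import argparse
import warnings


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="markitect")
    subparsers = parser.add_subparsers(dest="command", required=True)
    # New canonical command name.
    subparsers.add_parser("db-data", help="Complete data access")
    # Old name stays registered so existing scripts keep working.
    subparsers.add_parser("metadata", help="Deprecated alias for db-data")
    return parser


def run(argv: list[str]) -> str:
    """Parse argv and return the canonical command name to dispatch."""
    args = build_parser().parse_args(argv)
    if args.command == "metadata":
        warnings.warn(
            "'metadata' is deprecated; use 'db-data' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return "db-data"
    return args.command
```

Mapping the old name onto the new one in a single place keeps both spellings on the same code path, so the alias can later be removed without touching the command implementation.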
Phase 2: Content Commands (Current Focus)
- Create `content_processor.py` module with ContentStats and ContentProcessor classes
- Implement `content-stats` command - Statistics about content (word count, line count, headings, etc.)
- Implement `content-get [path]` command - Echo content without frontmatter and tailmatter
- Comprehensive testing for both commands
- Integration with existing CLI patterns
Related Issues (Now Separate)
The following phases have been moved to separate issues for focused development:
- Phase 3: Frontmatter Commands → Issue #41: Frontmatter Commands - YAML/JSON Header Manipulation
- Phase 4: Contentmatter Commands → Issue #42: Contentmatter Commands - MMD Key-Value Processing
- Phase 5: Tailmatter Commands → Issue #43: Tailmatter Commands - QA and Editorial Metadata Management
🧪 TESTING STRATEGY
Test Organization (Updated for Content Commands Focus):
- ✅ `test_issue_38_command_restructuring.py` - Phase 1 tests (completed)
- 🎯 `test_issue_38_content_commands.py` - Phase 2 content commands tests (current focus)
Test Categories:
- Command Existence Tests - Verify all commands are properly registered
- Functionality Tests - Test core behavior for each command
- Error Handling Tests - Invalid files, missing keys, malformed content
- Format Support Tests - JSON, YAML, table output formats
- Integration Tests - Commands working together in workflows
- Performance Tests - Large file handling and response times
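To make the functionality and error-handling categories concrete, here is a small sketch written with the standard-library `unittest`. The project's actual test runner is not stated in this plan, and `count_words` is a hypothetical stand-in for whatever logic ends up behind `content-stats`.

```python
import unittest


def count_words(markdown: str) -> int:
    """Hypothetical stand-in for the logic behind content-stats."""
    if not isinstance(markdown, str):
        raise TypeError("markdown must be a string")
    # Naive whitespace split; markup tokens such as '#' count as words.
    return len(markdown.split())


class ContentStatsFunctionalityTest(unittest.TestCase):
    def test_counts_words_across_lines(self):
        self.assertEqual(count_words("# Title\n\nTwo words\n"), 4)

    def test_empty_document_has_zero_words(self):
        self.assertEqual(count_words(""), 0)


class ContentStatsErrorHandlingTest(unittest.TestCase):
    def test_rejects_non_string_input(self):
        with self.assertRaises(TypeError):
            count_words(None)


def run_suite() -> unittest.TestResult:
    """Load both test classes and run them, returning the result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(ContentStatsFunctionalityTest))
    suite.addTests(loader.loadTestsFromTestCase(ContentStatsErrorHandlingTest))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Under the TDD8 flow described later, tests like these would be written first and expected to fail until the real implementation exists.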
🏗️ TECHNICAL ARCHITECTURE
New Module Structure (Content Commands Focus):
markitect/
├── content_processor.py # Content parsing and analysis (THIS ISSUE)
└── [Other processors moved to separate issues]
Data Models (Content Commands Focus):
    from dataclasses import dataclass
    from typing import Dict


    @dataclass
    class ContentStats:
        word_count: int
        line_count: int
        character_count: int
        heading_counts: Dict[int, int]  # level -> count
        link_count: int
        external_link_count: int
        image_count: int
        code_block_count: int
        list_item_count: int


    class ContentProcessor:
        # Method bodies are placeholders; only the planned interface is shown.
        def analyze_content(self, content: str) -> ContentStats: ...
        def extract_content(self, markdown: str) -> str: ...
        def strip_frontmatter(self, content: str) -> str: ...
        def strip_tailmatter(self, content: str) -> str: ...
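A minimal sketch of how a few of the ContentStats fields might be computed is shown below. It is regex-based and intentionally naive: words are whitespace-separated tokens (so markup such as `#` is counted), only ATX headings and inline `[text](url)` links are recognized, and `#` lines inside code blocks would be miscounted as headings. The function name `basic_stats` and the dictionary return type are assumptions made for brevity, not the planned interface.

```python
import re
from collections import Counter
from typing import Dict

# ATX heading: one to six '#' at line start, then whitespace and some text.
ATX_HEADING = re.compile(r"^(#{1,6})[ \t]+\S", re.MULTILINE)
# Inline link that is not preceded by '!' (which would make it an image).
INLINE_LINK = re.compile(r"(?<!!)\[[^\]]*\]\([^)\s]+\)")
INLINE_IMAGE = re.compile(r"!\[[^\]]*\]\([^)\s]+\)")


def basic_stats(content: str) -> Dict[str, object]:
    """Compute a subset of the planned statistics for a markdown string."""
    heading_counts = Counter(
        len(match.group(1)) for match in ATX_HEADING.finditer(content)
    )
    return {
        "word_count": len(content.split()),
        "line_count": len(content.splitlines()),
        "character_count": len(content),
        "heading_counts": dict(heading_counts),  # level -> count
        "link_count": len(INLINE_LINK.findall(content)),
        "image_count": len(INLINE_IMAGE.findall(content)),
    }


sample = (
    "# Title\n\n"
    "See [docs](https://example.com) and ![logo](logo.png).\n\n"
    "## Section\n"
)
stats = basic_stats(sample)
print(stats["heading_counts"], stats["link_count"], stats["image_count"])
```

A real implementation would more likely build on the project's existing markdown parsing capabilities than on regular expressions, which cannot reliably tell code blocks from prose.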
CLI Commands (This Issue):
- `content-stats` - Comprehensive content analysis and statistics
- `content-get` - Extract raw content without frontmatter/tailmatter
- ✅ `db-data` - Complete data access (renamed from metadata, completed)
📊 SUCCESS METRICS
Functional Success (Updated for Content Commands):
- 2 new content CLI commands implemented and working
- Complete test coverage (>95%) for content command functionality
- Backward compatibility maintained for existing workflows
- Performance within 10% of current db-data command speed
User Experience Success:
- Intuitive command naming following consistent patterns
- Comprehensive help documentation for all commands
- Consistent output formatting across all commands
- Clear error messages for all failure scenarios
Technical Success (Content Commands Focus):
- Clean content processing module with clear responsibilities
- Extensible ContentProcessor architecture for future content analysis features
- Efficient markdown parsing and content extraction
- Thread-safe operation for concurrent content analysis
⚡ IMPLEMENTATION APPROACH
TDD8 Methodology:
- ISSUE: Break down into manageable sub-issues for each phase
- TEST: Write comprehensive tests for each command before implementation
- RED: Ensure tests fail initially (proper TDD red state)
- GREEN: Implement minimal code to make tests pass
- REFACTOR: Clean up implementation and optimize performance
- DOCUMENT: Update CLI help, README, and user documentation
- REFINE: Performance testing and edge case handling
- PUBLISH: Integration testing and final validation
Development Order (Updated for Content Commands Focus):
- ✅ Complete Phase 1 (command restructuring) - DONE
- 🎯 Implement Phase 2 (content commands) - CURRENT FOCUS
  - Create content_processor.py module
  - Implement content-stats command
  - Implement content-get command
  - Comprehensive testing
🎯 IMMEDIATE NEXT STEPS
- ✅ Phase 1 Tests and Implementation - COMPLETED
- ✅ Implement db-data Command - COMPLETED
- 🎯 Create Content Processor Module: Foundation for content analysis (NEXT)
- 🎯 Implement content-stats Command: Content analysis and statistics (NEXT)
- 🎯 Implement content-get Command: Raw content extraction (NEXT)
📝 NOTES
- Backward Compatibility: Maintain `metadata` command with deprecation warning
- Performance: Cache parsed components to avoid re-parsing for related commands
- Error Handling: Graceful degradation for malformed markdown files
- Output Formats: Support table, JSON, YAML formats consistently across all commands
- Documentation: Reference MarkdownMatters.md specification for implementation details
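The note on consistent output formats could be served by one shared rendering helper, sketched below for the table and JSON cases using only the standard library. YAML is omitted because it would need a third-party library such as PyYAML. The function name `render` and the two-space table layout are assumptions, not the project's existing formatter.

```python
import json
from typing import Mapping


def render(stats: Mapping[str, int], fmt: str = "table") -> str:
    """Render a flat mapping of statistics in the requested format."""
    if fmt == "json":
        return json.dumps(dict(stats), indent=2, sort_keys=True)
    if fmt == "table":
        # Pad keys to a common width so the values line up in one column.
        width = max((len(key) for key in stats), default=0)
        return "\n".join(
            f"{key.ljust(width)}  {value}" for key, value in stats.items()
        )
    raise ValueError(f"unsupported format: {fmt}")


print(render({"word_count": 120, "line_count": 14}))
```

Routing every command through one helper like this is what would keep the formats consistent, and rejecting unknown formats explicitly fits the plan's goal of clear error messages.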
Estimated Timeline: 3-5 days for content commands implementation
Risk Level: Low (focused scope, clear requirements, foundation completed)
Dependencies: Existing CLI infrastructure, markdown parsing capabilities
🔗 RELATED ISSUES
- Issue #41: Frontmatter Commands - YAML/JSON Header Manipulation
- Issue #42: Contentmatter Commands - MMD Key-Value Processing
- Issue #43: Tailmatter Commands - QA and Editorial Metadata Management
- Issue #39: Database Command Reorganization (foundation completed)