# MarkiTect Content Capability
A self-contained capability that extracts and analyzes the main content of MarkdownMatters documents, excluding the frontmatter and tailmatter zones.
## Overview
The markitect-content capability provides content extraction and statistics for MarkdownMatters documents. It separates the main document content from the metadata zones (frontmatter and tailmatter) and reports word, line, paragraph, and character counts for the content that remains.
## Features
- **Content Extraction**: Extract main markdown content without frontmatter/tailmatter zones
- **Content Statistics**: Calculate word count, line count, paragraph count, and character count
- **CLI Commands**: Direct command-line access to content operations
- **Contentmatter Preservation**: Keeps inline metadata (MMD key-value pairs) as part of the content
## API
### Core Classes
#### `ContentParser`
Main parser class for content extraction and analysis.
```python
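from pathlib import Path

from markitect_content import ContentParser

# Read a MarkdownMatters document (the path is illustrative)
text = Path("document.md").read_text(encoding="utf-8")
parser = ContentParser()
# Extract content without matter zones
content = parser.extract_content(text)
# Calculate content statistics
stats = parser.calculate_stats(content)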
```
#### `ContentStats`
Statistics data structure with content metrics.
```python
from markitect_content import ContentStats

# `stats` is the value returned by parser.calculate_stats(content) above.
# A ContentStats object contains:
# - word_count: int
# - line_count: int
# - paragraph_count: int
# - character_count: int
# Convert to dictionary
stats_dict = stats.to_dict()
```
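To make the four metrics concrete, here is a minimal standalone sketch of how such counts could be computed. It is not the capability's implementation: `SketchStats` and `sketch_stats` are hypothetical names, and the counting rules (whitespace-separated words, paragraphs as runs of non-blank lines, characters including whitespace) are assumptions that may differ from what `ContentParser.calculate_stats` actually does.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SketchStats:
    """Hypothetical stand-in for ContentStats with the four documented fields."""

    word_count: int
    line_count: int
    paragraph_count: int
    character_count: int

    def to_dict(self) -> dict:
        return asdict(self)


def sketch_stats(content: str) -> SketchStats:
    """Count words, lines, paragraphs, and characters under the assumed rules."""
    lines = content.splitlines()
    paragraphs = 0
    in_paragraph = False
    for line in lines:
        if line.strip():
            # A non-blank line after a blank one (or at the start) opens a paragraph.
            if not in_paragraph:
                paragraphs += 1
            in_paragraph = True
        else:
            in_paragraph = False
    return SketchStats(
        word_count=len(content.split()),
        line_count=len(lines),
        paragraph_count=paragraphs,
        character_count=len(content),
    )
```

Calling `sketch_stats(content).to_dict()` yields a plain dictionary keyed by the four field names, which is convenient for JSON serialization.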
### CLI Commands
#### `content-get`
Extract content without frontmatter and tailmatter.
```bash
markitect content-get --file document.md
```
#### `content-stats`
Calculate content statistics.
```bash
markitect content-stats --file document.md --format json
markitect content-stats --file document.md --format text
```
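If you want to consume the statistics from a script, one option is to call the CLI and parse its JSON output. The sketch below uses only the `--file` and `--format` flags shown above. The helper names `build_command` and `read_stats` are hypothetical, and it assumes that `--format json` writes a single JSON object to standard output, which this README does not confirm.

```python
import json
import subprocess


def build_command(path, fmt="json"):
    """Build the argument list for the documented content-stats command."""
    return ["markitect", "content-stats", "--file", str(path), "--format", fmt]


def read_stats(path, run=subprocess.run):
    """Run content-stats and parse its output.

    Assumption: the JSON format is one JSON object on stdout.
    The `run` parameter exists so the call can be replaced in tests.
    """
    result = run(build_command(path), capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Passing `check=True` makes a non-zero exit status raise `subprocess.CalledProcessError` instead of silently returning unparsable output.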
## Content Processing Rules
1. **Frontmatter Removal**: Removes YAML frontmatter blocks (`---...---`)
2. **Tailmatter Removal**: Removes tailmatter blocks (`` ```yaml tailmatter...``` ``)
3. **Contentmatter Preservation**: Keeps inline MMD key-value pairs
4. **Content Statistics**: Counts are calculated on cleaned content only
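The first three rules can be sketched with two regular expressions. This is an illustration under stated assumptions, not the capability's actual parser: `strip_matter` is a hypothetical helper, frontmatter is assumed to sit at the very start of the document, tailmatter is assumed to be a fenced block with the info string `yaml tailmatter` at the very end, and the `Status: draft` line is an assumed example of an inline key-value pair.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Assumption: frontmatter is a `---` delimited block at the very start of the text.
FRONTMATTER_RE = re.compile(r"\A---[ \t]*\n(?:.*?\n)?---[ \t]*(?:\n|\Z)", re.DOTALL)

# Assumption: tailmatter is a fenced `yaml tailmatter` block that ends the text.
TAILMATTER_RE = re.compile(
    r"^" + FENCE + r"yaml tailmatter[ \t]*\n.*?^" + FENCE + r"\s*\Z",
    re.DOTALL | re.MULTILINE,
)


def strip_matter(text: str) -> str:
    """Drop frontmatter and tailmatter; leave everything else untouched."""
    text = FRONTMATTER_RE.sub("", text, count=1)
    text = TAILMATTER_RE.sub("", text, count=1)
    return text.strip("\n")


sample = "\n".join([
    "---",
    "title: Demo",
    "---",
    "# Heading",
    "",
    "Status: draft",  # inline key-value pair, kept as content
    "",
    "Body text.",
    "",
    FENCE + "yaml tailmatter",
    "reviewed: true",
    FENCE,
    "",
])

print(strip_matter(sample))
```

Because the frontmatter pattern is anchored to the start of the text, a `---` horizontal rule in the middle of a document is left alone.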
## Installation
From the repository root, install the capability as an editable dependency in your MarkiTect environment:
```bash
pip install -e capabilities/markitect-content/
```
## Testing
Run the capability test suite:
```bash
cd capabilities/markitect-content/
pytest tests/
```
## Compliance
This capability follows the ComposableRepositoryParadigm:
- ✅ Src layout (supports PEP 660 editable installs)
- ✅ Unidirectional dependencies
- ✅ Self-contained with own tests
- ✅ Independent configuration
- ✅ Clean API boundaries
## Dependencies
- click>=8.0.0 (for CLI commands)
- pytest>=7.0.0 (dev dependency for testing)