MarkiTect Content Capability

A self-contained capability that extracts and analyzes the main content of MarkdownMatters documents, excluding the frontmatter and tailmatter zones.

Overview

The markitect-content capability provides content extraction and statistics for MarkdownMatters documents. It separates the main document content from the metadata zones (frontmatter and tailmatter) and reports word, line, paragraph, and character counts for that content.

Features

  • Content Extraction: Extract main markdown content without frontmatter/tailmatter zones
  • Content Statistics: Calculate word count, line count, paragraph count, and character count
  • CLI Commands: Direct command-line access to content operations
  • Contentmatter Preservation: Keeps inline metadata (MMD key-value pairs) as part of the content

API

Core Classes

ContentParser

Main parser class for content extraction and analysis.

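from pathlib import Path

from markitect_content import ContentParser

# Read a MarkdownMatters document (example path)
text = Path("document.md").read_text(encoding="utf-8")

parser = ContentParser()

# Extract content without matter zones
content = parser.extract_content(text)

# Calculate content statistics
stats = parser.calculate_stats(content)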

ContentStats

Statistics data structure with content metrics.

from markitect_content import ContentStats

# `stats` is the object returned by ContentParser.calculate_stats() above.
# It contains:
# - word_count: int
# - line_count: int
# - paragraph_count: int
# - character_count: int

# Convert to dictionary
stats_dict = stats.to_dict()
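
To make the shape of the stats object concrete, here is a minimal sketch of how such a structure could be modeled as a dataclass. The names `SketchStats` and `count_stats` are hypothetical and are not part of `markitect_content`. The counting rules (whitespace-separated words, paragraphs separated by blank lines) are assumptions, so the real implementation may produce different numbers.

```python
import re
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SketchStats:
    """Hypothetical stand-in for ContentStats; field names follow this README."""

    word_count: int
    line_count: int
    paragraph_count: int
    character_count: int

    def to_dict(self) -> dict:
        # Plain dict, convenient for JSON output
        return asdict(self)


def count_stats(content: str) -> SketchStats:
    """Count metrics under assumed rules (not necessarily MarkiTect's)."""
    # Paragraphs: runs of non-blank lines separated by one or more blank lines
    paragraphs = [p for p in re.split(r"\n\s*\n", content.strip()) if p.strip()]
    return SketchStats(
        word_count=len(content.split()),
        line_count=len(content.splitlines()),
        paragraph_count=len(paragraphs),
        character_count=len(content),
    )
```

A frozen dataclass keeps the metrics immutable once computed, and `asdict` gives the dictionary form without hand-written conversion code.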

CLI Commands

content-get

Extract content without frontmatter and tailmatter.

markitect content-get --file document.md

content-stats

Calculate content statistics.

markitect content-stats --file document.md --format json
markitect content-stats --file document.md --format text

Content Processing Rules

  1. Frontmatter Removal: Removes YAML frontmatter blocks (---...---)
  2. Tailmatter Removal: Removes tailmatter blocks (```yaml tailmatter ... ```)
  3. Contentmatter Preservation: Keeps inline MMD key-value pairs
  4. Content Statistics: Counts are calculated on cleaned content only
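
The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the capability's actual implementation: frontmatter is assumed to be a leading block delimited by `---` lines, tailmatter is assumed to be a trailing fenced block whose opening fence reads `yaml tailmatter`, and the function name `strip_matter` is hypothetical.

```python
import re

FENCE = "`" * 3  # built indirectly so the example nests inside a fenced block

# Assumed frontmatter shape: a leading block delimited by lines of '---'
FRONTMATTER = re.compile(r"\A---[ \t]*\n(?:.*?\n)?---[ \t]*(?:\n|\Z)", re.DOTALL)

# Assumed tailmatter shape: a trailing fenced block opened with 'yaml tailmatter'
TAILMATTER = re.compile(
    r"^" + re.escape(FENCE) + r"yaml tailmatter[ \t]*\n"
    r".*?\n" + re.escape(FENCE) + r"\s*\Z",
    re.DOTALL | re.MULTILINE,
)


def strip_matter(text: str) -> str:
    """Remove frontmatter and tailmatter (rules 1 and 2)."""
    text = FRONTMATTER.sub("", text, count=1)
    text = TAILMATTER.sub("", text, count=1)
    # Inline key-value lines are not touched (rule 3)
    return text.strip("\n")


document = "\n".join([
    "---",
    "title: Example",
    "---",
    "Author: Jane",
    "",
    "Body text here.",
    "",
    FENCE + "yaml tailmatter",
    "reviewed: true",
    FENCE,
    "",
])

content = strip_matter(document)
word_count = len(content.split())  # rule 4: counted on cleaned content only
print(content)
print(word_count)
```

Here the inline `Author: Jane` line survives as content, while the `title` and `reviewed` keys in the matter zones are excluded from the word count.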

Installation

Install as an editable dependency in your MarkiTect environment:

pip install -e capabilities/markitect-content/

Testing

Run the capability test suite:

cd capabilities/markitect-content/
pytest tests/

Compliance

This capability follows the ComposableRepositoryParadigm:

  • Src layout with editable-install support (PEP 660)
  • Unidirectional dependencies
  • Self-contained with own tests
  • Independent configuration
  • Clean API boundaries

Dependencies

  • click>=8.0.0 (for CLI commands)
  • pytest>=7.0.0 (dev dependency for testing)