Completed Phase 2 of the schema-of-schemas implementation with full markdown schema support. This enables schemas to be authored as markdown files with rich documentation and embedded JSON schemas.

Core Implementation (markitect/schema_loader.py):
- MarkdownSchemaLoader class with comprehensive parsing capabilities
- YAML frontmatter extraction with error handling
- JSON code block extraction with section preference (## Schema Definition)
- Metadata merging with x-markitect-source tracking
- Schema saving with template support and round-trip capability
- Helper methods: list_json_blocks(), validate_schema_structure()

Test Coverage (tests/test_schema_loader.py):
- 35 comprehensive unit tests (100% passing)
- Tests for loading, parsing, saving, round-trip conversion
- Edge case handling (empty files, binary files, malformed blocks)
- Fixed binary file test to use invalid UTF-8 sequences

Example Schema (markitect/schemas/manpage-schema-v1.0.md):
- First markdown schema following naming convention
- Complete manpage schema with frontmatter + documentation + JSON
- Demonstrates section classification and content control
- Shows proper structure for future schema authors

Documentation (roadmap/schema-of-schemas/SCHEMA_LOADER_GUIDE.md):
- Comprehensive user guide (600+ lines)
- API reference with examples
- Best practices and troubleshooting
- Integration patterns for CLI and validator

Progress Tracking:
- Updated TODO.md with Phase 2 completion
- Updated CHANGELOG.md with implementation details
- Next: Phase 3 - Schema-for-Schemas Metaschema

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
# Markdown Schema Loader - User Guide
Version: 1.0 Status: Implemented Created: 2026-01-04
## Overview
The Markdown Schema Loader enables MarkiTect to load JSON schemas from markdown files, combining rich documentation with machine-readable validation rules. This aligns with MarkiTect's markdown-first philosophy while maintaining JSON Schema compatibility.
## Markdown Schema Format
A markdown schema file consists of three parts:
- YAML Frontmatter: Metadata about the schema
- Documentation: Rich markdown content explaining the schema
- Schema Definition: JSON schema in a code block
### Example Structure
````markdown
---
schema-id: "https://markitect.dev/schemas/domain/v1.0"
version: "1.0.0"
status: "stable"
---
# Schema Title v1.0
## Overview
Description of what this schema validates...
## Usage
How to use this schema...
## Schema Definition
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "My Schema",
  "type": "object",
  ...
}
```
## Version History
- v1.0.0 - Initial version
````
## Frontmatter Metadata
### Required Fields
None are strictly required, but these are recommended:
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `schema-id` | string | Canonical URI for the schema | `https://markitect.dev/schemas/manpage/v1.0` |
| `version` | string | SemVer version | `1.0.0` |
| `status` | string | Lifecycle status | `stable`, `draft`, `deprecated` |
### Optional Fields
| Field | Type | Description |
|-------|------|-------------|
| `domain` | string | Schema domain name |
| `description` | string | Brief schema description |
| `authors` | array | List of authors |
| `created` | string | Creation date (ISO 8601) |
| `updated` | string | Last update date (ISO 8601) |
### Metadata Merging
Frontmatter metadata takes precedence over schema fields:
- `schema-id` → `$id` in the schema
- `version` → `version` in the schema
- `status` → `x-markitect-metadata.status` in the schema
All frontmatter is preserved in `x-markitect-source.frontmatter`.
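
The mapping above can be pictured with a small stand-alone sketch. It is not the loader's code: `merge_metadata` is a hypothetical helper, and the `file` key inside `x-markitect-source` is an assumption made for illustration; only the three field mappings and the `frontmatter` key come from this guide.

```python
import copy


def merge_metadata(schema: dict, frontmatter: dict, source_file: str) -> dict:
    """Hypothetical sketch of the merge rules described above."""
    merged = copy.deepcopy(schema)  # leave the caller's schema untouched

    # Frontmatter wins over values already present in the schema
    if "schema-id" in frontmatter:
        merged["$id"] = frontmatter["schema-id"]
    if "version" in frontmatter:
        merged["version"] = frontmatter["version"]
    if "status" in frontmatter:
        merged.setdefault("x-markitect-metadata", {})["status"] = frontmatter["status"]

    # Preserve the full frontmatter; the "file" key is an assumed name
    merged["x-markitect-source"] = {
        "file": source_file,
        "frontmatter": dict(frontmatter),
    }
    return merged


merged = merge_metadata(
    {"title": "My Schema", "version": "0.9.0"},
    {
        "schema-id": "https://markitect.dev/schemas/domain/v1.0",
        "version": "1.0.0",
        "status": "stable",
    },
    "domain-schema-v1.0.md",
)
print(merged["$id"], merged["version"], merged["x-markitect-metadata"]["status"])
```

Note how the frontmatter `version` replaces the `0.9.0` embedded in the JSON, which is what "takes precedence" means here.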
## JSON Schema Extraction
### Schema Definition Section
The loader prefers JSON blocks under a `## Schema Definition` heading:
````markdown
## Schema Definition
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  ...
}
```
````
### Fallback Behavior
If no `## Schema Definition` section exists, the loader uses the **first** JSON code block in the file.
### Multiple JSON Blocks
You can include multiple JSON blocks in documentation:
````markdown
## Example Usage
```json
{
  "name": "example",
  "version": "1.0"
}
```
## Schema Definition
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "properties": {
    "name": {"type": "string"},
    "version": {"type": "string"}
  }
}
```
````
The loader uses the block under the `## Schema Definition` heading, not the earlier example block.
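
The preference and fallback rules can be sketched as follows. This is a minimal illustration, not the loader's implementation: `pick_schema_block` is a hypothetical name, the regex flags are assumptions, and the patterns are modelled on the ones listed under Implementation Details.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example can live inside a fenced block
JSON_BLOCK = re.compile(FENCE + r"json\s*\n(.*?)\n" + FENCE, re.DOTALL)
SECTION = re.compile(r"^##\s+Schema Definition\s*\n", re.MULTILINE)


def pick_schema_block(markdown: str) -> dict:
    """Prefer the JSON block after '## Schema Definition', else the first one."""
    section = SECTION.search(markdown)
    match = JSON_BLOCK.search(markdown, section.end()) if section else None
    if match is None:
        match = JSON_BLOCK.search(markdown)  # fallback: first JSON block in the file
    if match is None:
        raise ValueError("no JSON code block found")
    return json.loads(match.group(1))


doc = "\n".join([
    "## Example Usage",
    FENCE + "json",
    '{"name": "example"}',
    FENCE,
    "## Schema Definition",
    FENCE + "json",
    '{"type": "object"}',
    FENCE,
])
print(pick_schema_block(doc))
```

With the `## Schema Definition` heading removed from `doc`, the same call would return the example block instead, which is why the dedicated section is worth keeping.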
## Using the Loader
### Python API
```python
from pathlib import Path

from markitect.schema_loader import MarkdownSchemaLoader

# Create loader instance
loader = MarkdownSchemaLoader()

# Load schema from markdown
schema_data = loader.load_schema(Path("manpage-schema-v1.0.md"))

# Access components
schema = schema_data['schema']         # JSON Schema dict
metadata = schema_data['metadata']     # Frontmatter dict
docs = schema_data['documentation']    # Full markdown content
source = schema_data['source_file']    # Source file path

# Use the schema
print(f"Loaded: {schema['title']}")
print(f"Version: {schema['version']}")
print(f"Status: {metadata['status']}")
```
### Loading from Markdown
```python
# Load schema
schema_data = loader.load_schema(Path("my-schema-v1.0.md"))

# Check for issues
issues = loader.validate_schema_structure(schema_data['schema'])
if issues:
    for issue in issues:
        print(f"⚠️ {issue}")
```
### Saving to Markdown
```python
# Create a schema
schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "My Schema",
    "version": "1.0.0",
    "type": "object",
    "properties": {
        "name": {"type": "string"}
    }
}

# Save as markdown
loader.save_schema(
    schema=schema,
    md_path=Path("my-schema-v1.0.md"),
    frontmatter={
        "schema-id": "https://example.com/schemas/my-schema/v1.0",
        "status": "draft"
    }
)
```
### Round-Trip Conversion
```python
import json

# Load existing JSON schema
json_schema = json.loads(Path("old-schema.json").read_text())

# Save as markdown
loader.save_schema(
    schema=json_schema,
    md_path=Path("new-schema-v1.0.md")
)

# Load it back
schema_data = loader.load_schema(Path("new-schema-v1.0.md"))

# Spot-check that the title survived the round trip
assert schema_data['schema']['title'] == json_schema['title']
```
## Advanced Features
### Listing JSON Blocks
Useful for debugging when multiple JSON blocks exist:
```python
content = Path("schema.md").read_text()
blocks = loader.list_json_blocks(content)
print(f"Found {len(blocks)} JSON blocks:")
for position, json_content in blocks:
    print(f"  Position {position}: {len(json_content)} chars")
```
### Schema Structure Validation
Check for recommended fields and conventions:
```python
issues = loader.validate_schema_structure(schema)
for issue in issues:
    print(f"⚠️ {issue}")

# Example output:
# ⚠️ Missing recommended field: $id
# ⚠️ Missing MarkiTect convention: version field
```
### Custom Templates
Use custom markdown templates for saving schemas:
````python
template = """---
{frontmatter_yaml}
---
# {title}
{description}
## Schema
```json
{schema_json}
```
"""

loader.save_schema(
    schema=schema,
    md_path=Path("custom-schema-v1.0.md"),
    template=template
)
````
## Error Handling
### Common Errors
| Error | Cause | Solution |
|-------|-------|----------|
| `FileNotFoundError` | Schema file doesn't exist | Check file path |
| `SchemaNotFoundError` | No JSON block in markdown | Add ```json code block |
| `InvalidSchemaFormatError` | Invalid JSON or YAML | Check syntax |
| `SchemaFilenameError` | Invalid filename format | Use `{domain}-schema-v{major}.{minor}.md` |
### Example Error Handling
```python
from pathlib import Path

from markitect.schema_loader import (
    MarkdownSchemaLoader,
    SchemaNotFoundError,
    InvalidSchemaFormatError
)

loader = MarkdownSchemaLoader()

try:
    schema_data = loader.load_schema(Path("my-schema.md"))
except FileNotFoundError as e:
    print(f"❌ File not found: {e}")
except SchemaNotFoundError as e:
    print(f"❌ No schema in file: {e}")
except InvalidSchemaFormatError as e:
    print(f"❌ Invalid format: {e}")
```
## Best Practices
### 1. Use Schema Definition Section
Always place the main schema under `## Schema Definition`:
````markdown
## Schema Definition
```json
{...}
```
````
### 2. Include Frontmatter
Provide metadata for better discoverability:
```yaml
---
schema-id: "https://markitect.dev/schemas/domain/v1.0"
version: "1.0.0"
status: "stable"
---
```
### 3. Add Rich Documentation
Explain the schema purpose, usage, and examples:
````markdown
## Overview
This schema validates...
## Usage
```bash
markitect validate doc.md --schema my-schema-v1.0
```
## Examples
...
````
### 4. Version Your Schemas
Follow the naming convention:
- Initial: `my-schema-v1.0.md`
- Minor update: `my-schema-v1.1.md`
- Breaking change: `my-schema-v2.0.md`
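
To make the convention concrete, here is a hedged sketch that parses the `{domain}-schema-v{major}.{minor}.md` pattern named in the error table. `parse_schema_filename` is a hypothetical helper, and the set of characters allowed in the domain part is an assumption; the loader's own check (the one behind `SchemaFilenameError`) may differ.

```python
import re

# Assumed character set for the domain part: lowercase letters, digits, hyphens
FILENAME = re.compile(
    r"^(?P<domain>[a-z0-9][a-z0-9-]*)-schema-v(?P<major>\d+)\.(?P<minor>\d+)\.md$"
)


def parse_schema_filename(name: str) -> tuple[str, int, int]:
    """Split a schema filename into (domain, major, minor)."""
    match = FILENAME.match(name)
    if match is None:
        raise ValueError(f"not a valid schema filename: {name!r}")
    return match["domain"], int(match["major"]), int(match["minor"])


for name in ["manpage-schema-v1.0.md", "manpage-schema-v1.1.md", "manpage-schema-v2.0.md"]:
    print(parse_schema_filename(name))
```

Returning the version as integers lets versions be compared as tuples, so `(2, 0) > (1, 1)` holds where a string comparison of `"10.0"` and `"9.0"` would not.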
### 5. Validate Structure
Always check for common issues:
```python
issues = loader.validate_schema_structure(schema)
if not issues:
    print("✅ Schema structure is valid")
```
## Integration with MarkiTect
### CLI Usage (Future)
Once integrated with the CLI, you'll be able to:
```bash
# Ingest markdown schema
markitect schema-ingest manpage-schema-v1.0.md

# Validate against markdown schema
markitect validate document.md --schema manpage-schema-v1.0

# Export schema
markitect schema-get manpage-schema-v1.0 --output json
```
### Validator Integration
The `SchemaValidator` will automatically detect `.md` schemas:
```python
from markitect.validator import SchemaValidator

validator = SchemaValidator()
validator.validate(
    document="my-doc.md",
    schema="manpage-schema-v1.0.md"  # .md extension auto-detected
)
```
## Markdown Schema Template
Here's a complete template for creating new schemas:
````markdown
---
schema-id: "https://markitect.dev/schemas/YOUR-DOMAIN/v1.0"
version: "1.0.0"
status: "draft"
domain: "YOUR-DOMAIN"
description: "Brief description of what this schema validates"
authors:
  - "Your Name <email@example.com>"
created: "2026-01-04"
---
# YOUR-DOMAIN Schema v1.0
## Overview
Detailed description of what this schema validates and why it exists.
## Features
- Feature 1
- Feature 2
- Feature 3
## Usage
### Validating Documents
```bash
markitect validate document.md --schema YOUR-DOMAIN-schema-v1.0
```
### Common Validation Errors
- Error Type 1: Description and solution
- Error Type 2: Description and solution
## Schema Definition
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "YOUR DOMAIN Schema",
  "description": "Schema description",
  "type": "object",
  "properties": {
    "field1": {
      "type": "string",
      "description": "Description of field1"
    }
  },
  "required": ["field1"]
}
```
## Examples
### Valid Document
Example of valid content...
### Invalid Document
Example of invalid content...
## Version History
### v1.0.0 (2026-01-04)
- Initial version
- Feature A
- Feature B
## Related Documentation
````
## Testing
The loader has comprehensive test coverage:
```bash
# Run all loader tests
pytest tests/test_schema_loader.py -v

# Run specific test class
pytest tests/test_schema_loader.py::TestMarkdownSchemaLoader -v

# Check coverage
pytest tests/test_schema_loader.py --cov=markitect.schema_loader
```
Test Results: 35/35 tests passing (100%)
## Implementation Details
### Regex Patterns
The loader uses these regex patterns:
```python
# Frontmatter pattern
r'^---\s*\n(.*?)\n---\s*\n'
# JSON code block pattern
r'```json\s*\n(.*?)\n```'
# Schema Definition section pattern
r'##\s+Schema Definition\s*\n'
```
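
As an illustration of how the frontmatter pattern behaves, the sketch below applies it to a small document. The `re.DOTALL` flag is an assumption (the guide lists only the pattern text, and `.*?` needs it to span several lines), and `split_frontmatter` is a hypothetical helper that leaves the YAML unparsed.

```python
import re

# Pattern text taken from the list above; the DOTALL flag is assumed
FRONTMATTER = re.compile(r"^---\s*\n(.*?)\n---\s*\n", re.DOTALL)


def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (raw YAML text, remaining markdown); YAML is empty if absent."""
    match = FRONTMATTER.match(text)  # match() anchors at the start of the file
    if match is None:
        return "", text
    return match.group(1), text[match.end():]


sample = '---\nversion: "1.0.0"\nstatus: "draft"\n---\n# Title\n'
yaml_text, body = split_frontmatter(sample)
print(repr(yaml_text))
print(repr(body))
```

Because the pattern is anchored with `^` and applied with `match()`, frontmatter that does not start on the first line is treated as ordinary content, which matches the troubleshooting advice to keep it at the start of the file.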
### Metadata Merging
The `_merge_metadata` method:
- Copies the original schema
- Adds `x-markitect-source` with file metadata
- Merges frontmatter fields:
  - `schema-id` → `$id`
  - `version` → `version`
  - `status` → `x-markitect-metadata.status`
### File Encoding
All files are read/written as UTF-8. Invalid UTF-8 sequences raise `InvalidSchemaFormatError`.
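
One way such behavior could be implemented is sketched below. Everything here is illustrative: the exception class is a local stand-in (its real base class is not stated in this guide), and `read_schema_text` is a hypothetical function, not part of the loader's API.

```python
import tempfile
from pathlib import Path


class InvalidSchemaFormatError(Exception):
    """Local stand-in for the loader's exception of the same name."""


def read_schema_text(path: Path) -> str:
    """Read a file as UTF-8, turning decode failures into InvalidSchemaFormatError."""
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        raise InvalidSchemaFormatError(f"Failed to read schema file: {path}") from exc


with tempfile.TemporaryDirectory() as tmp:
    bad = Path(tmp) / "binary-schema-v1.0.md"
    bad.write_bytes(b"\xff\xfe\x00\x01")  # 0xFF can never appear in valid UTF-8
    try:
        read_schema_text(bad)
    except InvalidSchemaFormatError as exc:
        print(f"Rejected: {exc}")
```

Passing `encoding="utf-8"` explicitly keeps the result independent of the platform's default encoding.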
## Troubleshooting
### Schema Not Found
Problem: `SchemaNotFoundError: No JSON schema found`
Solutions:
- Ensure you have a ```json code block
- Check the JSON syntax is valid
- Verify the code block is properly closed with ```
### Invalid YAML Frontmatter
Problem: `InvalidSchemaFormatError: Invalid YAML frontmatter`
Solutions:
- Check YAML syntax (indentation, colons, quotes)
- Ensure frontmatter is between `---` delimiters
- Verify frontmatter is at the start of file
### Binary File Error
Problem: `InvalidSchemaFormatError: Failed to read schema file`
Solutions:
- Ensure file is text, not binary
- Check file encoding is UTF-8
- Verify file isn't corrupted
## See Also
- Schema Naming Specification
- Schema Management Workplan
- Phase 2 Documentation
- Example Markdown Schema
## Changelog
### v1.0.0 (2026-01-04)
- Initial implementation
- 35 unit tests (100% passing)
- Frontmatter extraction with YAML parsing
- JSON code block extraction with section preference
- Metadata merging with x-markitect-source tracking
- Schema saving with template support
- Round-trip save/load capability
- Helper methods for validation and debugging