feat: implement Phase 2 - Markdown Schema Loader

Complete Phase 2 of the schema-of-schemas implementation: markdown
schema support. Schemas can now be authored as markdown files that
combine rich documentation with an embedded JSON schema.

Core Implementation (markitect/schema_loader.py):
- MarkdownSchemaLoader class for parsing markdown schema files
- YAML frontmatter extraction with error handling
- JSON code block extraction with section preference (## Schema Definition)
- Metadata merging with x-markitect-source tracking
- Schema saving with template support and round-trip capability
- Helper methods: list_json_blocks(), validate_schema_structure()
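
The loading steps listed above can be sketched roughly as follows. This is
an illustrative, stdlib-only approximation and not the code in
markitect/schema_loader.py: the function names are hypothetical, the
frontmatter parser handles only flat `key: value` pairs instead of full
YAML, and both the merge direction (schema keys win) and the value stored
under x-markitect-source are assumptions.

```python
import json
import re

FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*\n", re.DOTALL)
JSON_BLOCK_RE = re.compile(r"```json\s*\n(.*?)\n```", re.DOTALL)
SECTION_RE = re.compile(
    r"^## Schema Definition\s*$(.*?)(?=^## |\Z)", re.DOTALL | re.MULTILINE
)


def split_frontmatter(text):
    """Return (metadata, body). Simplification: flat `key: value` pairs only."""
    match = FRONTMATTER_RE.match(text)
    if not match:
        return {}, text
    metadata = {}
    for line in match.group(1).splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            metadata[key.strip()] = value.strip().strip('"')
    return metadata, text[match.end():]


def pick_schema_block(body):
    """Prefer a JSON block under '## Schema Definition', else the first one."""
    section = SECTION_RE.search(body)
    scope = section.group(1) if section else body
    blocks = JSON_BLOCK_RE.findall(scope) or JSON_BLOCK_RE.findall(body)
    if not blocks:
        raise ValueError("no JSON code block found")
    return json.loads(blocks[0])


def load_markdown_schema(text, source="<string>"):
    """Parse a markdown schema and merge frontmatter into it (assumed policy)."""
    metadata, body = split_frontmatter(text)
    schema = pick_schema_block(body)
    for key, value in metadata.items():
        schema.setdefault(key, value)  # assumption: existing schema keys win
    schema["x-markitect-source"] = source  # assumption: records the origin
    return schema
```

Calling `load_markdown_schema(path.read_text(encoding="utf-8"), source=str(path))`
would return a dict built from the preferred JSON block plus the
frontmatter fields.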

Test Coverage (tests/test_schema_loader.py):
- 35 comprehensive unit tests (100% passing)
- Tests for loading, parsing, saving, round-trip conversion
- Edge case handling (empty files, binary files, malformed blocks)
- Fixed binary file test to use invalid UTF-8 sequences
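
One likely reason the binary file test needs invalid UTF-8 sequences:
low control bytes such as 0x00-0x03 decode as valid UTF-8, so a fixture
built from them never triggers a decode error. The sketch below
illustrates the difference; `is_utf8_text` is a hypothetical helper, not
part of the loader's API.

```python
import tempfile
from pathlib import Path


def is_utf8_text(path):
    """Return True if the file decodes as UTF-8 (hypothetical helper)."""
    try:
        Path(path).read_text(encoding="utf-8")
    except UnicodeDecodeError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    control_bytes = Path(tmp) / "control.bin"
    control_bytes.write_bytes(b"\x00\x01\x02\x03")  # ASCII control chars: valid UTF-8
    invalid_bytes = Path(tmp) / "invalid.bin"
    invalid_bytes.write_bytes(b"\xff\xfe\x80\x81")  # 0xFF never occurs in valid UTF-8
    control_ok = is_utf8_text(control_bytes)
    invalid_ok = is_utf8_text(invalid_bytes)
```

Here only the second file fails to decode, which is the condition a
binary-file error path would need to exercise.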

Example Schema (markitect/schemas/manpage-schema-v1.0.md):
- First markdown schema following naming convention
- Complete manpage schema with frontmatter + documentation + JSON
- Demonstrates section classification and content control
- Shows proper structure for future schema authors

Documentation (roadmap/schema-of-schemas/SCHEMA_LOADER_GUIDE.md):
- Comprehensive user guide (600+ lines)
- API reference with examples
- Best practices and troubleshooting
- Integration patterns for CLI and validator

Progress Tracking:
- Updated TODO.md with Phase 2 completion
- Updated CHANGELOG.md with implementation details
- Next: Phase 3 - Schema-for-Schemas Metaschema

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-05 00:02:15 +01:00
parent 14108533fb
commit b81ce5631d
6 changed files with 2151 additions and 14 deletions

50
TODO.md

@@ -12,33 +12,40 @@ The structure organizes **future tasks** by their impact, just as a changelog or
This section is for tasks currently being discussed with or worked on by the coding assistant. These are the ephemeral, flow-of-thought tasks.
### Schema-of-Schemas Implementation (Active - Phase 1)
### Schema-of-Schemas Implementation (Active - Phase 2)
**Status:** Phase 1 - Filename Convention & Validation (In Progress)
**Status:** Phase 2 - Markdown Schema Loader (Completed ✅)
**Workplan:** See `roadmap/schema-of-schemas/WORKPLAN.md`
**Current Goals:**
1. ✅ Establish naming convention: `{domain}-schema-v{major}.{minor}.md`
2. 🔄 Implement filename validation logic
3. 🔄 Update CLI with validation
4. Create markdown schema loader
5. ⏳ Build schema-for-schemas metaschema
2. Implement filename validation logic
3. ✅ Create markdown schema loader
4. Create example markdown schema
5. ⏳ Build schema-for-schemas metaschema (Next: Phase 3)
6. ⏳ Migrate existing schemas to new format
**Phase 1 Tasks (Completed ✅):**
- [x] Write `markitect/schema_naming.py` with validation logic
- [x] Add unit tests for filename validation (50 tests, 100% passing)
- [ ] Update `schema-ingest` command with validation (Next: Phase 2)
- [x] Create SCHEMA_NAMING_SPEC.md documentation
**Phase 2 Tasks (Completed ✅):**
- [x] Implement MarkdownSchemaLoader class (markitect/schema_loader.py, 515 lines)
- [x] Add frontmatter extraction (YAML)
- [x] Add JSON code block extraction with section preference
- [x] Add metadata merging with x-markitect-source tracking
- [x] Write comprehensive unit tests (35 tests, 100% passing)
- [x] Create example markdown schema (manpage-schema-v1.0.md)
- [x] Create SCHEMA_LOADER_GUIDE.md documentation
**Next Phases:**
- Phase 2: Markdown Schema Loader (2-3 days)
- Phase 3: Schema-for-Schemas Metaschema (2 days)
- Phase 4: Schema Migration (1-2 days)
- Phase 5: CLI & Documentation Updates (1 day)
- Phase 6: Testing & Validation (1 day)
**Expected Completion:** 8-10 days total
**Expected Completion:** 6-7 days remaining
---
@@ -131,6 +138,31 @@ The **capability-capability** includes:
- Includes content control and validation rules
- Full documentation and usage examples (README.md)
### 2026-01-04 - Phase 2: Markdown Schema Loader
- ✅ Implemented MarkdownSchemaLoader class (markitect/schema_loader.py, 515 lines)
- ✅ YAML frontmatter extraction with validation
- ✅ JSON code block extraction with "Schema Definition" section preference
- ✅ Metadata merging with x-markitect-source tracking
- ✅ Schema saving with template support and round-trip capability
- ✅ Comprehensive test suite (35 unit tests, 100% passing)
- ✅ Created example markdown schema (manpage-schema-v1.0.md)
- ✅ Created SCHEMA_LOADER_GUIDE.md with complete usage documentation
**Key Features Delivered:**
- Markdown-first schema format with embedded JSON
- Frontmatter metadata merges into schema ($id, version, status)
- Automatic detection of multiple JSON blocks
- Schema structure validation helper
- Error handling for binary files and invalid formats
- List JSON blocks helper for debugging
- Full round-trip save/load capability
**Example Markdown Schema:**
- manpage-schema-v1.0.md demonstrating complete format
- Includes frontmatter, documentation, and JSON schema
- Shows section classification and content control
- Follows naming convention: {domain}-schema-v{major}.{minor}.md
### 2025-12-17 - Architecture Refactoring
- ✅ Implemented ReusableCapabilitiesArchitecture v0.1
- ✅ Added feedback capability to issue-facade