chore: archive completed schema-of-schemas implementation
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled

Moved schema-of-schemas planning artifacts from roadmap to history
with datestamp prefix, marking completion of all 6 implementation phases.

**Changes:**
- Moved roadmap/schema-of-schemas/ → history/2026-01-05-schema-of-schemas/
- Updated all documentation references to new location
- Marked implementation as completed in TODO.md
- Updated CHANGELOG.md to reflect archived status

**Implementation Summary:**
All 6 phases completed successfully:
- Phase 1: Filename validation (50 tests)
- Phase 2: Markdown schema loader (35 tests)
- Phase 3: Schema-for-schemas metaschema (12 tests)
- Phase 4: Schema migration (2 migrated, 3 deleted)
- Phase 5: CLI enhancements (multi-schema validation)
- Phase 6: Integration testing and documentation

**Deliverables:**
- 97 unit tests (100% passing)
- 4 production schemas in registry
- Comprehensive user documentation
- Updated examples (manpages, terminology)
- Complete schema management system

The schema-of-schemas topic is now complete and archived for
historical reference.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
2026-01-05 14:13:48 +01:00
parent d32dc41315
commit 3003b9b8da
11 changed files with 17 additions and 13 deletions

View File

@@ -1,94 +0,0 @@
# Schema-of-Schemas Implementation
**Project:** Markdown-First Schema System
**Status:** Planning → Implementation
**Timeline:** 8-10 days
## Quick Links
- **[WORKPLAN.md](./WORKPLAN.md)** - Detailed implementation plan with phases
- **[SCHEMA_MANAGEMENT_PROPOSAL.md](./SCHEMA_MANAGEMENT_PROPOSAL.md)** - Full analysis and options
- **[SCHEMA_MANAGEMENT_SUMMARY.md](./SCHEMA_MANAGEMENT_SUMMARY.md)** - Executive summary
## What We're Building
### Goals
1. ✅ Filename convention: `{domain}-schema-v{version}.md`
2. ✅ Markdown-first schema format (documentation + embedded JSON)
3. ✅ Schema-for-schemas to validate all schemas
4. ✅ Migrate existing schemas to new format
5. ✅ Clean up duplicate/legacy schemas
### Why
- **Consistency:** Enforced naming and versioning
- **Alignment:** Markdown-first matches MarkiTect philosophy
- **Documentation:** Rich docs alongside schemas
- **Validation:** Schema-for-schemas ensures quality
- **Maintainability:** Clear versions and structure
## Implementation Phases
1. **Phase 0:** Planning & Setup (0.5 days) ← **Current**
2. **Phase 1:** Filename Convention (1 day)
3. **Phase 2:** Markdown Loader (2-3 days)
4. **Phase 3:** Schema-for-Schemas (2 days)
5. **Phase 4:** Schema Migration (1-2 days)
6. **Phase 5:** CLI & Docs (1 day)
7. **Phase 6:** Testing (1 day)
## Current Status
### Completed
- [x] Directory structure created
- [x] Planning documents moved to roadmap
- [x] Comprehensive workplan written
- [x] Example markdown schema created
### Next Steps
1. Complete Phase 0 planning artifacts
2. Begin Phase 1 implementation
3. Checkpoint review after Phase 1
## Key Decisions
### Naming Convention
**Format:** `{domain}-schema-v{major}.{minor}.md`
**Example:** `manpage-schema-v1.0.md`
### Schema Format
**Markdown with embedded JSON:**
```markdown
---
schema-id: "https://markitect.dev/schemas/manpage/v1"
version: "1.0.0"
---
# Manpage Schema v1.0
[Documentation...]
## Schema Definition
```json
{ ... JSON schema ... }
```
```
### Schema Migration Plan
```
Old → New
──────────────────────────────────────────────────
terminology-schema.json → terminology-schema-v1.0.md
api-documentation → api-documentation-schema-v1.0.md
enhanced-manpage → manpage-schema-v2.0.md
markdown-manpage → REMOVE (duplicate)
markdown-manpage-schema.json → REMOVE (duplicate)
```
## Progress Tracking
Track progress in: `roadmap/schema-of-schemas/IMPLEMENTATION_LOG.md` (to be created)
## Questions?
See the full workplan for detailed implementation steps, risks, and mitigation strategies.

View File

@@ -1,579 +0,0 @@
# Markdown Schema Loader - User Guide
**Version:** 1.0
**Status:** Implemented
**Created:** 2026-01-04
## Overview
The Markdown Schema Loader enables MarkiTect to load JSON schemas from markdown files, combining rich documentation with machine-readable validation rules. This aligns with MarkiTect's markdown-first philosophy while maintaining JSON Schema compatibility.
## Markdown Schema Format
A markdown schema file consists of three parts:
1. **YAML Frontmatter**: Metadata about the schema
2. **Documentation**: Rich markdown content explaining the schema
3. **Schema Definition**: JSON schema in a code block
### Example Structure
```markdown
---
schema-id: "https://markitect.dev/schemas/domain/v1.0"
version: "1.0.0"
status: "stable"
---
# Schema Title v1.0
## Overview
Description of what this schema validates...
## Usage
How to use this schema...
## Schema Definition
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "My Schema",
"type": "object",
...
}
```
## Version History
- v1.0.0 - Initial version
```
## Frontmatter Metadata
### Required Fields
None are strictly required, but these are recommended:
| Field | Type | Description | Example |
|-------|------|-------------|---------|
| `schema-id` | string | Canonical URI for the schema | `https://markitect.dev/schemas/manpage/v1.0` |
| `version` | string | SemVer version | `1.0.0` |
| `status` | string | Lifecycle status | `stable`, `draft`, `deprecated` |
### Optional Fields
| Field | Type | Description |
|-------|------|-------------|
| `domain` | string | Schema domain name |
| `description` | string | Brief schema description |
| `authors` | array | List of authors |
| `created` | string | Creation date (ISO 8601) |
| `updated` | string | Last update date (ISO 8601) |
### Metadata Merging
Frontmatter metadata takes precedence over schema fields:
- `schema-id` → `$id` in the schema
- `version` → `version` in the schema
- `status` → `x-markitect-metadata.status` in the schema
All frontmatter is preserved in `x-markitect-source.frontmatter`.
## JSON Schema Extraction
### Schema Definition Section
The loader prefers JSON blocks under a `## Schema Definition` heading:
```markdown
## Schema Definition
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
...
}
```
```
### Fallback Behavior
If no `## Schema Definition` section exists, the loader uses the **first** JSON code block in the file.
### Multiple JSON Blocks
You can include multiple JSON blocks in documentation:
```markdown
## Example Usage
```json
{
"name": "example",
"version": "1.0"
}
```
## Schema Definition
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"properties": {
"name": {"type": "string"},
"version": {"type": "string"}
}
}
```
```
The loader will use the schema under `## Schema Definition` heading.
## Using the Loader
### Python API
```python
from pathlib import Path
from markitect.schema_loader import MarkdownSchemaLoader
# Create loader instance
loader = MarkdownSchemaLoader()
# Load schema from markdown
schema_data = loader.load_schema(Path("manpage-schema-v1.0.md"))
# Access components
schema = schema_data['schema'] # JSON Schema dict
metadata = schema_data['metadata'] # Frontmatter dict
docs = schema_data['documentation'] # Full markdown content
source = schema_data['source_file'] # Source file path
# Use the schema
print(f"Loaded: {schema['title']}")
print(f"Version: {schema['version']}")
print(f"Status: {metadata['status']}")
```
### Loading from Markdown
```python
# Load schema
schema_data = loader.load_schema(Path("my-schema-v1.0.md"))
# Check for issues
issues = loader.validate_schema_structure(schema_data['schema'])
if issues:
for issue in issues:
print(f"⚠️ {issue}")
```
### Saving to Markdown
```python
# Create a schema
schema = {
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "My Schema",
"version": "1.0.0",
"type": "object",
"properties": {
"name": {"type": "string"}
}
}
# Save as markdown
loader.save_schema(
schema=schema,
md_path=Path("my-schema-v1.0.md"),
frontmatter={
"schema-id": "https://example.com/schemas/my-schema/v1.0",
"status": "draft"
}
)
```
### Round-Trip Conversion
```python
# Load existing JSON schema
import json
json_schema = json.loads(Path("old-schema.json").read_text())
# Save as markdown
loader.save_schema(
schema=json_schema,
md_path=Path("new-schema-v1.0.md")
)
# Load it back
schema_data = loader.load_schema(Path("new-schema-v1.0.md"))
# Schemas are equivalent
assert schema_data['schema']['title'] == json_schema['title']
```
## Advanced Features
### Listing JSON Blocks
Useful for debugging when multiple JSON blocks exist:
```python
content = Path("schema.md").read_text()
blocks = loader.list_json_blocks(content)
print(f"Found {len(blocks)} JSON blocks:")
for position, json_content in blocks:
print(f" Position {position}: {len(json_content)} chars")
```
### Schema Structure Validation
Check for recommended fields and conventions:
```python
issues = loader.validate_schema_structure(schema)
for issue in issues:
print(f"⚠️ {issue}")
# Example output:
# ⚠️ Missing recommended field: $id
# ⚠️ Missing MarkiTect convention: version field
```
### Custom Templates
Use custom markdown templates for saving schemas:
```python
template = """---
{frontmatter_yaml}
---
# {title}
{description}
## Schema
```json
{schema_json}
```
"""
loader.save_schema(
schema=schema,
md_path=Path("custom-schema-v1.0.md"),
template=template
)
```
## Error Handling
### Common Errors
| Error | Cause | Solution |
|-------|-------|----------|
| `FileNotFoundError` | Schema file doesn't exist | Check file path |
| `SchemaNotFoundError` | No JSON block in markdown | Add ```json code block |
| `InvalidSchemaFormatError` | Invalid JSON or YAML | Check syntax |
| `SchemaFilenameError` | Invalid filename format | Use `{domain}-schema-v{major}.{minor}.md` |
### Example Error Handling
```python
from markitect.schema_loader import (
MarkdownSchemaLoader,
SchemaNotFoundError,
InvalidSchemaFormatError
)
loader = MarkdownSchemaLoader()
try:
schema_data = loader.load_schema(Path("my-schema.md"))
except FileNotFoundError as e:
print(f"❌ File not found: {e}")
except SchemaNotFoundError as e:
print(f"❌ No schema in file: {e}")
except InvalidSchemaFormatError as e:
print(f"❌ Invalid format: {e}")
```
## Best Practices
### 1. Use Schema Definition Section
Always place the main schema under `## Schema Definition`:
```markdown
## Schema Definition
```json
{...}
```
```
### 2. Include Frontmatter
Provide metadata for better discoverability:
```yaml
---
schema-id: "https://markitect.dev/schemas/domain/v1.0"
version: "1.0.0"
status: "stable"
---
```
### 3. Add Rich Documentation
Explain the schema purpose, usage, and examples:
```markdown
## Overview
This schema validates...
## Usage
```bash
markitect validate doc.md --schema my-schema-v1.0
```
## Examples
...
```
### 4. Version Your Schemas
Follow the naming convention:
- Initial: `my-schema-v1.0.md`
- Minor update: `my-schema-v1.1.md`
- Breaking change: `my-schema-v2.0.md`
### 5. Validate Structure
Always check for common issues:
```python
issues = loader.validate_schema_structure(schema)
if not issues:
print("✅ Schema structure is valid")
```
## Integration with MarkiTect
### CLI Usage (Future)
Once integrated with the CLI, you'll be able to:
```bash
# Ingest markdown schema
markitect schema-ingest manpage-schema-v1.0.md
# Validate against markdown schema
markitect validate document.md --schema manpage-schema-v1.0
# Export schema
markitect schema-get manpage-schema-v1.0 --output json
```
### Validator Integration
The SchemaValidator will automatically detect `.md` schemas:
```python
from markitect.validator import SchemaValidator
validator = SchemaValidator()
validator.validate(
document="my-doc.md",
schema="manpage-schema-v1.0.md" # .md extension auto-detected
)
```
## Markdown Schema Template
Here's a complete template for creating new schemas:
```markdown
---
schema-id: "https://markitect.dev/schemas/YOUR-DOMAIN/v1.0"
version: "1.0.0"
status: "draft"
domain: "YOUR-DOMAIN"
description: "Brief description of what this schema validates"
authors:
- "Your Name <email@example.com>"
created: "2026-01-04"
---
# YOUR-DOMAIN Schema v1.0
## Overview
Detailed description of what this schema validates and why it exists.
## Features
- Feature 1
- Feature 2
- Feature 3
## Usage
### Validating Documents
```bash
markitect validate document.md --schema YOUR-DOMAIN-schema-v1.0
```
### Common Validation Errors
1. **Error Type 1**: Description and solution
2. **Error Type 2**: Description and solution
## Schema Definition
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "YOUR DOMAIN Schema",
"description": "Schema description",
"type": "object",
"properties": {
"field1": {
"type": "string",
"description": "Description of field1"
}
},
"required": ["field1"]
}
```
## Examples
### Valid Document
```markdown
Example of valid content...
```
### Invalid Document
```markdown
Example of invalid content...
```
## Version History
### v1.0.0 (2026-01-04)
- Initial version
- Feature A
- Feature B
## Related Documentation
- [Related Schema 1](../other-schema-v1.0.md)
- [MarkiTect Documentation](../../README.md)
```
## Testing
The loader has comprehensive test coverage:
```bash
# Run all loader tests
pytest tests/test_schema_loader.py -v
# Run specific test class
pytest tests/test_schema_loader.py::TestMarkdownSchemaLoader -v
# Check coverage
pytest tests/test_schema_loader.py --cov=markitect.schema_loader
```
**Test Results**: 35/35 tests passing (100%)
## Implementation Details
### Regex Patterns
The loader uses these regex patterns:
```python
# Frontmatter pattern
r'^---\s*\n(.*?)\n---\s*\n'
# JSON code block pattern
r'```json\s*\n(.*?)\n```'
# Schema Definition section pattern
r'##\s+Schema Definition\s*\n'
```
### Metadata Merging
The `_merge_metadata` method:
1. Copies the original schema
2. Adds `x-markitect-source` with file metadata
3. Merges frontmatter fields:
- `schema-id``$id`
- `version``version`
- `status``x-markitect-metadata.status`
### File Encoding
All files are read/written as UTF-8. Invalid UTF-8 sequences raise `InvalidSchemaFormatError`.
## Troubleshooting
### Schema Not Found
**Problem**: `SchemaNotFoundError: No JSON schema found`
**Solutions**:
- Ensure you have a ```json code block
- Check the JSON syntax is valid
- Verify the code block is properly closed with ```
### Invalid YAML Frontmatter
**Problem**: `InvalidSchemaFormatError: Invalid YAML frontmatter`
**Solutions**:
- Check YAML syntax (indentation, colons, quotes)
- Ensure frontmatter is between `---` delimiters
- Verify frontmatter is at the start of file
### Binary File Error
**Problem**: `InvalidSchemaFormatError: Failed to read schema file`
**Solutions**:
- Ensure file is text, not binary
- Check file encoding is UTF-8
- Verify file isn't corrupted
## See Also
- [Schema Naming Specification](SCHEMA_NAMING_SPEC.md)
- [Schema Management Workplan](WORKPLAN.md)
- [Phase 2 Documentation](WORKPLAN.md#phase-2-markdown-schema-loader)
- [Example Markdown Schema](../../markitect/schemas/manpage-schema-v1.0.md)
## Changelog
### v1.0.0 (2026-01-04)
- Initial implementation
- 35 unit tests (100% passing)
- Frontmatter extraction with YAML parsing
- JSON code block extraction with section preference
- Metadata merging with x-markitect-source tracking
- Schema saving with template support
- Round-trip save/load capability
- Helper methods for validation and debugging

View File

@@ -1,569 +0,0 @@
# Schema Management Proposal
**Status:** Draft
**Created:** 2026-01-04
**Author:** Analysis of current state and proposed improvements
## Problem Statement
### 1. Inconsistent Schema Naming
**Current State:**
```
terminology-schema.json ← Has ".json" suffix
api-documentation ← No suffix
enhanced-manpage ← No suffix
markdown-manpage ← No suffix, duplicate title
markdown-manpage-schema.json ← Has ".json" suffix, duplicate title
```
**Issues:**
- No naming convention enforced
- Duplicate schemas (3 manpage schemas!)
- Mix of suffixed (.json) and non-suffixed names
- No way to distinguish versions
### 2. Missing Versioning
**Current State:**
- No version in filenames
- No version in schema metadata (beyond optional `$id`)
- No way to track schema evolution
- Breaking changes not apparent
**Issues:**
- Can't have multiple versions simultaneously
- No migration path when schemas change
- Unclear which schema version a document uses
### 3. Format Mismatch: JSON vs Markdown
**The Philosophical Problem:**
> MarkiTect is a markdown-centric tool, yet schemas are JSON files.
> This creates a conceptual and practical mismatch.
**Current State:**
- Documents: Markdown (.md)
- Schemas: JSON (.json)
- No unified format for documentation + schema
- Schemas lack rich documentation capabilities
## Proposed Solutions
### Part 1: Naming Convention & Versioning
#### Option A: Filename-Based Versioning (Recommended)
**Format:** `{domain}-{type}-schema-v{major}.{minor}.json`
**Examples:**
```
manpage-schema-v1.0.json # Manpage schema v1.0
manpage-schema-v2.0.json # Breaking change → v2.0
terminology-schema-v1.0.json # Terminology schema
api-documentation-schema-v1.0.json
arc42-schema-v1.0.json
```
**Benefits:**
- Clear versioning in filename
- Easy to see multiple versions
- SemVer compatible (major.minor)
- Searchable/sortable
**Migration Strategy:**
```bash
# Rename existing schemas
markdown-manpage → manpage-schema-v1.0.json
enhanced-manpage → manpage-schema-v2.0.json # (breaking changes)
terminology-schema.json → terminology-schema-v1.0.json
```
#### Option B: $id-Based Versioning
**Keep simple filenames, use `$id` for versioning:**
```json
{
"$id": "https://markitect.dev/schemas/manpage/v1",
"$schema": "http://json-schema.org/draft-07/schema#",
"version": "1.0.0",
...
}
```
**Filenames:** `manpage-schema.json`, `terminology-schema.json`
**Benefits:**
- Clean filenames
- Versioning in metadata
- Follows JSON Schema best practices
**Drawbacks:**
- Can't have multiple versions in same database
- Harder to see versions at a glance
#### Recommendation: **Hybrid Approach**
Combine both for maximum clarity:
```json
// File: manpage-schema-v1.json
{
"$id": "https://markitect.dev/schemas/manpage/v1",
"$schema": "http://json-schema.org/draft-07/schema#",
"version": "1.0.0",
"title": "Unix Manual Page Schema",
...
}
```
### Part 2: Schema Metadata Standard
Add required metadata to all schemas:
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://markitect.dev/schemas/{domain}/v{major}",
// Required metadata
"version": "1.0.0", // SemVer
"title": "Human Readable Title",
"description": "Detailed description",
// Optional metadata
"x-markitect-schema-type": "document-schema",
"x-markitect-version": {
"major": 1,
"minor": 0,
"patch": 0
},
"x-markitect-author": "MarkiTect Project",
"x-markitect-created": "2026-01-04",
"x-markitect-updated": "2026-01-04",
"x-markitect-deprecated": false,
"x-markitect-superseded-by": null,
"x-markitect-document-types": ["manpage", "manual"],
"x-markitect-example": "examples/manpages/example.md",
// Schema content
"type": "object",
"properties": { ... }
}
```
### Part 3: Format Mismatch Solutions
#### Option 1: Markdown-First with Embedded JSON (Recommended)
**File Format:** Markdown with frontmatter and JSON code block
```markdown
---
schema-version: "1.0.0"
schema-id: "https://markitect.dev/schemas/manpage/v1"
document-type: manpage
status: stable
---
# Manpage Schema v1.0
## Overview
This schema validates Unix/Linux manual page documentation following
standard conventions (SYNOPSIS, DESCRIPTION, OPTIONS, etc.).
## Document Types
- Manual pages (man pages)
- CLI command documentation
- API reference pages
## Usage
\`\`\`bash
markitect validate mycommand.1.md --schema manpage-schema-v1
\`\`\`
## Examples
See [examples/manpages/](../../examples/manpages/) for complete examples.
## Schema Definition
\`\`\`json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://markitect.dev/schemas/manpage/v1",
"version": "1.0.0",
"title": "Unix Manual Page Schema",
"type": "object",
"properties": {
"headings": {
"type": "object",
"properties": {
"level_1": { ... }
}
}
},
"x-markitect-sections": { ... }
}
\`\`\`
## Validation Rules
### Required Sections
- **NAME** - Command name and brief description
- **SYNOPSIS** - Command syntax
- **DESCRIPTION** - Detailed description
### Optional Sections
- **OPTIONS** - Command-line options
- **EXAMPLES** - Usage examples
- **SEE ALSO** - Related commands
## Version History
### v1.0.0 (2026-01-04)
- Initial release
- Basic manpage structure validation
```
**Implementation:**
```python
class MarkdownSchemaLoader:
"""Load schemas from markdown files with embedded JSON."""
def load_schema_from_markdown(self, md_path: Path) -> dict:
"""Extract JSON schema from markdown file."""
content = md_path.read_text()
# Parse frontmatter
frontmatter = self._extract_frontmatter(content)
# Extract JSON from code block
schema_json = self._extract_json_from_code_block(content)
# Merge metadata
schema = json.loads(schema_json)
schema['x-markitect-metadata'] = frontmatter
return schema
def save_schema_to_markdown(self, schema: dict, md_path: Path):
"""Save schema as markdown with embedded JSON."""
# Generate markdown documentation
doc = self._generate_schema_documentation(schema)
# Embed JSON schema
json_block = f"```json\n{json.dumps(schema, indent=2)}\n```"
# Combine
full_content = f"{doc}\n\n## Schema Definition\n\n{json_block}"
md_path.write_text(full_content)
```
**Benefits:**
- ✅ Markdown-first (aligns with MarkiTect philosophy)
- ✅ Rich documentation alongside schema
- ✅ Human-readable and editable
- ✅ Version history in same file
- ✅ Examples and usage inline
- ✅ Can extract JSON when needed
**Drawbacks:**
- ⚠️ Requires parsing logic
- ⚠️ Two sources of truth (markdown + embedded JSON)
- ⚠️ More complex than pure JSON
#### Option 2: Markdown Documentation Generator
**Keep JSON schemas, auto-generate markdown docs:**
```
schemas/
manpage-schema-v1.json # Source of truth
manpage-schema-v1.md # Auto-generated docs
```
**Command:**
```bash
markitect schema-document manpage-schema-v1.json
# Generates: manpage-schema-v1.md
```
**Benefits:**
- ✅ Simple implementation
- ✅ JSON remains source of truth
- ✅ Auto-generated docs always in sync
**Drawbacks:**
- ⚠️ Two files to manage
- ⚠️ Can't hand-edit documentation (gets overwritten)
#### Option 3: Markdown Schema Language (DSL)
**Define schemas in markdown-native syntax:**
```markdown
# Manpage Schema v1.0
## Document Structure
### Required Sections (Level 1 Heading)
**NAME**
- Classification: required
- Content: Command name in bold, followed by description
- Pattern: `**command** - description`
**SYNOPSIS**
- Classification: required
- Content: Command syntax with options
- Min paragraphs: 1
- Max paragraphs: 3
### Optional Sections
**OPTIONS**
- Classification: recommended
- Content: Definition list of command-line options
```
**Parser generates JSON schema from markdown:**
```bash
markitect schema-compile manpage-schema-v1.md --output manpage-schema-v1.json
```
**Benefits:**
- ✅ Pure markdown
- ✅ Human-friendly syntax
- ✅ No JSON editing needed
**Drawbacks:**
- ⚠️ Complex parser implementation
- ⚠️ Limited to MarkiTect-specific features
- ⚠️ Can't use standard JSON Schema tools
#### Option 4: Literate Schema Programming
**Inspired by literate programming, mix documentation and schema:**
```markdown
# Manpage Schema v1.0
Manual pages follow a standard structure. The NAME section is required:
<<define-name-section>>=
{
"NAME": {
"classification": "required",
"heading_level": 2,
"content_instruction": "Command name and brief description"
}
}
The SYNOPSIS section shows command syntax:
<<define-synopsis-section>>=
{
"SYNOPSIS": {
"classification": "required",
"heading_level": 2,
"min_code_blocks": 1
}
}
Complete schema:
<<manpage-schema.json>>=
{
"$schema": "http://json-schema.org/draft-07/schema#",
"x-markitect-sections": {
<<define-name-section>>,
<<define-synopsis-section>>
}
}
```
**Benefits:**
- ✅ Documentation and schema interleaved
- ✅ Literate programming benefits
- ✅ Reusable schema fragments
**Drawbacks:**
- ⚠️ Complex tangling/weaving
- ⚠️ Unfamiliar paradigm
- ⚠️ Overkill for simple schemas
## Recommendations
### Short-Term (Immediate)
1. **Naming Convention:**
- Format: `{domain}-schema-v{major}.{minor}.json`
- Example: `manpage-schema-v1.0.json`
2. **Schema Metadata:**
- Add required `version`, `title`, `description` fields
- Add `x-markitect-*` metadata extensions
- Document in schema-catalog.yaml
3. **Duplicate Cleanup:**
- Consolidate 3 manpage schemas into versioned series
- Keep enhanced-manpage as v2.0 (breaking changes)
- Archive old schemas
### Medium-Term (Next Phase)
4. **Markdown Schema Format (Option 1):**
- Implement markdown-first schema format
- Markdown file with embedded JSON in code block
- Parser extracts JSON for validation
- Rich documentation alongside schema
5. **Schema Documentation Generator:**
- Auto-generate markdown docs from JSON schemas
- Include examples, usage, version history
- Link to example documents
### Long-Term (Future)
6. **Schema DSL (Option 3):**
- Evaluate markdown schema language
- Prototype parser for common patterns
- Consider if DSL adds value over JSON
7. **Schema Registry API:**
- REST API for schema discovery
- Version negotiation
- Schema evolution tracking
## Implementation Plan
### Phase 1: Naming & Versioning (1-2 days)
**Tasks:**
1. Define naming convention spec
2. Create schema metadata template
3. Rename existing schemas
4. Update schema-catalog.yaml
5. Update documentation
**Deliverables:**
- Schema naming convention spec
- Migrated schemas with versions
- Updated catalog
### Phase 2: Markdown Schema Format (3-5 days)
**Tasks:**
1. Design markdown schema format
2. Implement parser (extract JSON from markdown)
3. Implement generator (create markdown from JSON)
4. Convert existing schemas to markdown format
5. Update CLI to support .md schemas
6. Write documentation and examples
**Deliverables:**
- Markdown schema parser/generator
- All schemas in markdown format
- Updated CLI commands
- Migration guide
### Phase 3: Schema Validation (2-3 days)
**Tasks:**
1. Create metaschema for validating schemas
2. Add schema validation command
3. Validate all existing schemas
4. Add CI check for schema validity
**Deliverables:**
- Schema-for-schemas (metaschema)
- Validation command
- CI integration
## Cost-Benefit Analysis
### Option 1: Markdown-First (Recommended)
**Cost:**
- Parser implementation: ~200 lines
- CLI updates: ~100 lines
- Migration effort: 2-3 days
- Testing: 1 day
**Benefit:**
- Aligned with markdown philosophy ⭐⭐⭐⭐⭐
- Rich documentation ⭐⭐⭐⭐⭐
- Version history inline ⭐⭐⭐⭐
- Human-friendly ⭐⭐⭐⭐⭐
- Lower barrier to entry ⭐⭐⭐⭐
**Total:** High value for reasonable cost
### Option 2: Documentation Generator
**Cost:**
- Generator implementation: ~150 lines
- Template design: 1 day
- Testing: 0.5 days
**Benefit:**
- Simple implementation ⭐⭐⭐⭐
- Auto-sync docs ⭐⭐⭐⭐
- JSON remains source ⭐⭐⭐
**Total:** Good value, lower cost
### Option 3: Schema DSL
**Cost:**
- DSL design: 2-3 days
- Parser implementation: ~500 lines
- Compiler: ~300 lines
- Testing: 2 days
- Documentation: 1 day
**Benefit:**
- Pure markdown ⭐⭐⭐⭐⭐
- No JSON editing ⭐⭐⭐⭐
- Limited ecosystem ⭐⭐
**Total:** High cost, uncertain value
## Decision Matrix
| Criterion | Option 1: Markdown-First | Option 2: Doc Generator | Option 3: DSL |
|-----------|-------------------------|------------------------|---------------|
| Markdown alignment | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| Implementation cost | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
| Documentation quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Tool ecosystem | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ |
| Maintainability | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| User-friendliness | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
## Recommendation Summary
1. **Immediate:** Implement naming convention and versioning
2. **Short-term:** Choose **Option 1 (Markdown-First)** for schema format
3. **Fallback:** If Option 1 proves too complex, use **Option 2 (Doc Generator)**
4. **Future:** Evaluate DSL if community demand emerges
## Next Steps
1. Review and approve this proposal
2. Create naming convention specification
3. Prototype markdown schema parser
4. Migrate one schema as proof-of-concept
5. Gather feedback and iterate
6. Full migration of all schemas
---
## Appendix: Example Markdown Schema
See `examples/schemas/manpage-schema-v1.md` for a complete example of the proposed format.

View File

@@ -1,154 +0,0 @@
# Schema Management: Executive Summary
**TL;DR:** Implement naming conventions, versioning, and markdown-first schema format to solve current schema management issues.
## Problems Identified
1. **Inconsistent Naming** - Mix of `schema.json` suffix and no suffix
2. **No Versioning** - Can't track schema evolution or maintain multiple versions
3. **Duplicate Schemas** - 3 manpage schemas with similar content
4. **Format Mismatch** - JSON schemas in markdown-centric tool
## Recommended Solution
### 1. Naming Convention (Immediate)
**Format:** `{domain}-schema-v{major}.{minor}.json` or `.md`
**Examples:**
```
manpage-schema-v1.0.json
terminology-schema-v1.0.json
api-documentation-schema-v1.0.json
```
**Migration:**
```
markdown-manpage → manpage-schema-v1.0.json
enhanced-manpage → manpage-schema-v2.0.json (breaking changes)
terminology-schema.json → terminology-schema-v1.0.json
```
### 2. Markdown-First Format (Short-term)
**Proposal:** Store schemas as markdown files with embedded JSON
**Benefits:**
- Aligns with markdown philosophy ✅
- Rich documentation alongside schema ✅
- Version history in same file ✅
- Examples and usage inline ✅
- Lower barrier to entry ✅
**Example:** See `examples/schemas/manpage-schema-v1.md`
**Format:**
```markdown
# Schema Title v1.0
## Documentation sections...
## Schema Definition
\`\`\`json
{ schema here }
\`\`\`
```
### 3. Schema Metadata Standard (Immediate)
**Required fields:**
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://markitect.dev/schemas/{domain}/v{major}",
"version": "1.0.0",
"title": "Human Readable Title",
"description": "Detailed description",
"x-markitect-metadata": {
"domain": "manpage",
"document-types": ["manual-page"],
"created": "2026-01-04",
"example": "examples/manpages/example.md"
}
}
```
## Implementation Phases
### Phase 1: Foundation (1-2 days)
- [x] Analyze current state
- [ ] Define naming convention spec
- [ ] Create schema metadata template
- [ ] Rename existing schemas
- [ ] Update schema-catalog.yaml
### Phase 2: Markdown Format (3-5 days)
- [ ] Design markdown schema format
- [ ] Implement parser (extract JSON from markdown)
- [ ] Convert 1 schema as proof-of-concept
- [ ] Test and iterate
- [ ] Migrate all schemas
### Phase 3: Tooling (2-3 days)
- [ ] Update CLI to support .md schemas
- [ ] Add schema validation command
- [ ] Create migration guide
- [ ] Update documentation
## Cost-Benefit Analysis
**Cost:** 6-10 days total effort
**Benefits:**
- Professional schema management ⭐⭐⭐⭐⭐
- Better discoverability ⭐⭐⭐⭐
- Easier maintenance ⭐⭐⭐⭐⭐
- Markdown alignment ⭐⭐⭐⭐⭐
- Version tracking ⭐⭐⭐⭐⭐
**ROI:** High - Foundational improvement that benefits all future schema work
## Alternative Considered
**Alternative:** Keep JSON, generate markdown docs automatically
**Pros:**
- Simpler implementation (2-3 days)
- JSON remains source of truth
- Standard tooling works
**Cons:**
- Doesn't solve format mismatch
- Documentation generated, not authored
- Two files to manage
**Verdict:** Markdown-first better aligns with project philosophy
## Quick Wins (Today)
1. **Rename schemas** with versioned names (30 minutes)
2. **Add metadata** to existing schemas (1 hour)
3. **Update catalog** with proper versioning (30 minutes)
## Questions to Resolve
1. **File extension:** `.md` or `.schema.md` for markdown schemas?
2. **JSON extraction:** Real-time or pre-compiled cache?
3. **Backward compatibility:** Support both formats during transition?
4. **CLI changes:** `--schema file.md` or auto-detect format?
## Next Steps
1. **Review** this proposal and example (examples/schemas/manpage-schema-v1.md)
2. **Decide** on markdown-first vs generated docs approach
3. **Prototype** parser for markdown schemas
4. **Migrate** one schema as proof-of-concept
5. **Iterate** based on feedback
6. **Full rollout** to all schemas
## References
- Full proposal: [SCHEMA_MANAGEMENT_PROPOSAL.md](./SCHEMA_MANAGEMENT_PROPOSAL.md)
- Example markdown schema: [examples/schemas/manpage-schema-v1.md](../../examples/schemas/manpage-schema-v1.md)
- Current schema catalog: [markitect/schemas/schema-catalog.yaml](../../markitect/schemas/schema-catalog.yaml)

View File

@@ -1,408 +0,0 @@
# Schema Naming Convention Specification
**Version:** 1.0
**Status:** Implemented
**Created:** 2026-01-04
## Overview
This specification defines the filename convention for all MarkiTect schema files to ensure consistency, discoverability, and version tracking across the schema ecosystem.
## Filename Format
### Standard Format
```
{domain}-schema-v{major}.{minor}.md
```
### Components
| Component | Description | Rules | Examples |
|-----------|-------------|-------|----------|
| **domain** | Schema domain identifier | - Lowercase only<br>- Start with letter<br>- Letters, numbers, hyphens<br>- No consecutive hyphens<br>- No leading/trailing hyphens | `manpage`<br>`api-documentation`<br>`arc42` |
| **schema** | Literal keyword | - Must be exactly `schema` | `schema` |
| **version** | SemVer major.minor | - Format: `v{major}.{minor}`<br>- Non-negative integers<br>- Must include both major and minor | `v1.0`<br>`v2.5`<br>`v10.25` |
| **extension** | File extension | - Must be `.md` (markdown) | `.md` |
### Regular Expression
```regex
^[a-z][a-z0-9-]*-schema-v\d+\.\d+\.md$
```
**Breakdown:**
- `^[a-z]` - Start with lowercase letter
- `[a-z0-9-]*` - Followed by lowercase letters, numbers, or hyphens
- `-schema-` - Literal string
- `v\d+\.\d+` - Version (v + digits + dot + digits)
- `\.md$` - Extension
## Valid Examples
### Simple Domains
```
manpage-schema-v1.0.md
terminology-schema-v1.0.md
glossary-schema-v1.0.md
```
### Multi-Word Domains
```
api-documentation-schema-v1.0.md
architecture-decision-record-schema-v1.0.md
software-requirements-specification-schema-v1.0.md
```
### With Numbers
```
arc42-schema-v1.0.md
rfc2119-keywords-schema-v1.0.md
iso27001-schema-v1.0.md
```
### Version Variations
```
manpage-schema-v1.0.md # Initial version
manpage-schema-v1.1.md # Minor update
manpage-schema-v2.0.md # Breaking change
manpage-schema-v10.25.md # Double-digit versions
```
## Invalid Examples
### Wrong Extension
```
❌ manpage-schema-v1.0.json # Must be .md
❌ manpage-schema-v1.0.yaml # Must be .md
❌ manpage-schema-v1.0 # Missing extension
```
### Missing Components
```
❌ manpage-v1.0.md # Missing "schema" keyword
❌ manpage-schema.md # Missing version
❌ manpage.md # Missing "schema" and version
```
### Version Format Errors
```
❌ manpage-schema-1.0.md # Missing 'v' prefix
❌ manpage-schema-v1.md # Missing minor version
❌ manpage-schema-v1.0.0.md # Too many version parts (patch not used)
❌ manpage-schema-v1-0.md # Hyphen instead of dot
```
### Case Errors
```
❌ ManPage-schema-v1.0.md # Uppercase in domain
❌ manpage-Schema-v1.0.md # Uppercase in keyword
❌ MANPAGE-SCHEMA-V1.0.MD # All uppercase
```
### Domain Format Errors
```
❌ 42answers-schema-v1.0.md # Starts with number
❌ -manpage-schema-v1.0.md # Starts with hyphen
❌ man_page-schema-v1.0.md # Underscore (use hyphen)
❌ man page-schema-v1.0.md # Space (use hyphen)
❌ my--schema-v1.0.md # Consecutive hyphens
```
## Version Numbering Guidelines
### Semantic Versioning
We use simplified SemVer with major.minor only:
**Major Version (X.0):**
- Breaking changes to schema structure
- Incompatible with previous version
- Documents validated against v1.0 may fail v2.0
**Examples:**
- `manpage-schema-v1.0.md``manpage-schema-v2.0.md` (breaking change)
- `api-schema-v1.0.md``api-schema-v2.0.md` (new required sections)
**Minor Version (X.Y):**
- Backward-compatible additions
- New optional sections or fields
- Relaxed constraints
- Documents validated against v1.0 still validate against v1.1
**Examples:**
- `manpage-schema-v1.0.md``manpage-schema-v1.1.md` (new optional section)
- `api-schema-v2.0.md``api-schema-v2.1.md` (additional metadata)
### Version Incrementing
```
v1.0 → v1.1 → v1.2 → ... → v1.9 → v1.10 → v1.11
v2.0 (breaking change)
```
### Initial Version
All new schemas start at `v1.0.md`:
```bash
# New schema
my-new-type-schema-v1.0.md
```
## Domain Naming Guidelines
### Good Domain Names
**Descriptive and Specific:**
```
✓ manpage-schema-v1.0.md # Clear: Unix manual pages
✓ api-documentation-schema-v1.0.md # Clear: API docs
✓ architecture-decision-record-schema-v1.0.md # Full ADR name
```
**Concise but Meaningful:**
```
✓ adr-schema-v1.0.md # Common abbreviation
✓ rfc-schema-v1.0.md # Well-known acronym
✓ arc42-schema-v1.0.md # Standard name
```
### Poor Domain Names
**Too Generic:**
```
❌ document-schema-v1.0.md # Too vague
❌ markdown-schema-v1.0.md # All schemas are markdown
❌ schema-schema-v1.0.md # Redundant (use "metaschema")
```
**Too Verbose:**
```
❌ my-custom-documentation-template-for-apis-v1.0.md # Too long
→ api-documentation-schema-v1.0.md # Better
```
**Unclear Abbreviations:**
```
❌ mt-schema-v1.0.md # What is "mt"?
❌ doc-schema-v1.0.md # Too generic
```
## Normalization Rules
When converting arbitrary strings to valid domain names:
1. **Convert to lowercase**
- `API Documentation``api documentation`
2. **Replace separators with hyphens**
- Spaces: `api documentation``api-documentation`
- Underscores: `my_type``my-type`
- Multiple separators: `my type``my--type`
3. **Remove consecutive hyphens**
- `my--type``my-type`
4. **Remove leading/trailing hyphens**
- `-my-type-``my-type`
5. **Validate result**
- Must start with letter
- Only lowercase letters, numbers, hyphens
### Example Normalizations
```python
"API Documentation" "api-documentation-schema-v1.0.md"
"My_Custom_Type" "my-custom-type-schema-v1.0.md"
"arc42 Architecture" "arc42-architecture-schema-v1.0.md"
"--leading-hyphen" "leading-hyphen-schema-v1.0.md"
```
## Implementation
### Validation Function
The naming convention is enforced by `markitect.schema_naming.validate_schema_filename()`:
```python
from markitect.schema_naming import validate_schema_filename
is_valid, metadata = validate_schema_filename("manpage-schema-v1.0.md")
if is_valid:
print(f"Domain: {metadata['domain']}")
print(f"Version: {metadata['version']}")
print(f"Major: {metadata['major']}, Minor: {metadata['minor']}")
```
### Suggestion Function
Generate valid filenames from arbitrary input:
```python
from markitect.schema_naming import suggest_schema_filename
# From clean input
filename = suggest_schema_filename("manpage", "1.0")
# → "manpage-schema-v1.0.md"
# From messy input (with normalization)
filename = suggest_schema_filename("API Documentation", "2.1")
# → "api-documentation-schema-v1.0.md"
```
### CLI Integration
The `schema-ingest` command validates filenames:
```bash
# Valid filename - accepted
$ markitect schema-ingest manpage-schema-v1.0.md
✅ Schema stored successfully
# Invalid filename - rejected (unless --force)
$ markitect schema-ingest manpage.json
❌ Invalid schema filename: manpage.json
Expected format: {domain}-schema-v{major}.{minor}.md
Example: manpage-schema-v1.0.md
Suggested filename: manpage-schema-v1.0.md
Use --force to skip validation
```
## Migration from Legacy Naming
### Current State Analysis
Existing schemas with inconsistent naming:
```
terminology-schema.json # Has .json extension
api-documentation # No version, no extension
enhanced-manpage # No version, no extension, unclear name
markdown-manpage # No version, no extension
markdown-manpage-schema.json # Has .json extension
```
### Migration Strategy
1. **Identify domain and version**
2. **Apply naming convention**
3. **Update database registration**
4. **Remove legacy entries**
### Migration Mapping
```
Old Name → New Name
────────────────────────────────────────────────────────────────
terminology-schema.json → terminology-schema-v1.0.md
api-documentation → api-documentation-schema-v1.0.md
enhanced-manpage → manpage-schema-v2.0.md
markdown-manpage → (DELETE - duplicate)
markdown-manpage-schema.json → (DELETE - duplicate)
```
**Rationale:**
- `enhanced-manpage` → v2.0 (has breaking changes: classification system)
- `markdown-manpage` variants → DELETE (superseded by v1.0 and v2.0)
## Special Cases
### Metaschema
The schema-for-schemas follows the same convention:
```
schema-schema-v1.0.md
```
Domain is `schema`, indicating it validates schemas themselves.
### Multiple Schemas for Same Domain
Use version numbers to distinguish:
```
manpage-schema-v1.0.md # Original
manpage-schema-v2.0.md # Enhanced with classifications
```
Or use more specific domain names:
```
manpage-simple-schema-v1.0.md # Simplified variant
manpage-extended-schema-v1.0.md # Extended variant
```
## Validation Testing
All schemas should pass the naming convention validation:
```bash
# Test a filename
python -c "
from markitect.schema_naming import is_valid_schema_filename
print(is_valid_schema_filename('manpage-schema-v1.0.md'))
"
# → True
# Get detailed errors
python -c "
from markitect.schema_naming import get_validation_errors
errors = get_validation_errors('invalid.json')
for error in errors:
print(error)
"
```
## Benefits
### Consistency
- All schemas follow same pattern
- Easy to recognize schema files
- Predictable naming
### Versioning
- Clear version tracking
- Multiple versions can coexist
- Breaking changes explicit (major version bump)
### Discoverability
- Glob patterns work: `*-schema-v*.md`
- Easy to list all schemas: `ls *-schema-*.md`
- Domain easily extractable
### Tooling
- Programmatic validation
- Automatic suggestion
- Migration support
## References
- **Implementation:** `markitect/schema_naming.py`
- **Tests:** `tests/test_schema_naming.py`
- **Workplan:** `roadmap/schema-of-schemas/WORKPLAN.md`
- **Examples:** `examples/schemas/manpage-schema-v1.0.md`
## Changelog
### v1.0 (2026-01-04)
- Initial specification
- Implemented validation and suggestion functions
- 50 unit tests (100% passing)
- CLI integration planned

View File

@@ -1,962 +0,0 @@
# Schema-of-Schemas Implementation Workplan
**Project:** Implement Markdown-First Schema System with Self-Description
**Created:** 2026-01-04
**Status:** Planning
**Duration:** 6-10 days
**Priority:** High - Foundation for all schema work
## Executive Summary
This workplan implements a comprehensive schema management system:
1. Filename conventions and versioning
2. Markdown-first schema format (`.md` with embedded JSON)
3. Schema-for-schemas (metaschema) for validation
4. Migration of existing schemas
5. Cleanup of legacy schemas from registry
## Project Goals
### Primary Goals
- [x] Establish filename convention: `{domain}-schema-v{version}.md`
- [ ] Implement markdown schema parser (extract JSON from markdown)
- [ ] Create schema-for-schemas to validate all schemas
- [ ] Migrate existing schemas to new format
- [ ] Remove legacy/duplicate schemas from registry
### Success Criteria
- ✅ All schemas follow naming convention
- ✅ Schemas stored as markdown files with embedded JSON
- ✅ Schema-for-schemas validates all schemas successfully
- ✅ No duplicate schemas in registry
- ✅ CLI commands work with `.md` schema files
- ✅ Documentation updated
## Architecture Overview
### Current State
```
Schemas: JSON files (.json)
Naming: Inconsistent (api-documentation, markdown-manpage-schema.json)
Versioning: None
Documentation: Separate or missing
Registry: Database with 5 schemas (3 duplicates)
```
### Target State
```
Schemas: Markdown files (.md) with embedded JSON
Naming: {domain}-schema-v{major}.{minor}.md
Versioning: SemVer in filename and metadata
Documentation: Inline with schema
Registry: Clean, versioned, no duplicates
Validation: Schema-for-schemas validates all schemas
```
### Components to Build
```
markitect/
├── schema_loader.py # NEW: Load schemas from markdown
├── schema_validator.py # UPDATED: Support .md schemas
├── cli.py # UPDATED: Accept .md schema files
└── schemas/
├── schema-schema-v1.md # NEW: Schema-for-schemas
└── ...versioned schemas...
examples/schemas/ # Markdown schema examples
└── manpage-schema-v1.md # Already created
roadmap/schema-of-schemas/ # Planning artifacts
├── WORKPLAN.md # This file
├── SCHEMA_NAMING_SPEC.md # Naming convention spec
└── IMPLEMENTATION_LOG.md # Progress tracking
```
## Phase Breakdown
### Phase 0: Planning & Setup ✅ (0.5 days)
**Goal:** Establish project structure and specifications
**Tasks:**
- [x] Create roadmap/schema-of-schemas directory
- [x] Move planning documents to roadmap
- [ ] Write naming convention specification
- [ ] Document schema metadata standard
- [ ] Create implementation checklist
**Deliverables:**
- [x] Directory structure
- [ ] SCHEMA_NAMING_SPEC.md
- [ ] SCHEMA_METADATA_SPEC.md
- [ ] This workplan
**Duration:** 0.5 days
**Status:** In Progress
---
### Phase 1: Filename Convention & Validation (1 day)
**Goal:** Establish and enforce filename conventions
**1.1 Define Naming Convention**
**Specification:**
```
Format: {domain}-schema-v{major}.{minor}.md
Components:
- domain: lowercase, hyphen-separated (e.g., "manpage", "api-documentation")
- schema: literal string "schema"
- version: SemVer major.minor (e.g., "v1.0", "v2.1")
- extension: ".md" (markdown)
Examples:
✓ manpage-schema-v1.0.md
✓ terminology-schema-v1.0.md
✓ api-documentation-schema-v1.0.md
✗ manpage.json (missing version)
✗ manpage-v1.md (missing "schema")
✗ ManPage-Schema-v1.0.md (wrong case)
```
**1.2 Implement Validation Function**
**File:** `markitect/schema_naming.py` (NEW)
```python
import re
from pathlib import Path
from typing import Tuple, Optional
SCHEMA_FILENAME_PATTERN = re.compile(
r'^(?P<domain>[a-z][a-z0-9-]*)-schema-v(?P<major>\d+)\.(?P<minor>\d+)\.md$'
)
def validate_schema_filename(filename: str) -> Tuple[bool, Optional[dict]]:
"""
Validate schema filename against convention.
Returns:
(is_valid, metadata_dict)
"""
match = SCHEMA_FILENAME_PATTERN.match(filename)
if not match:
return False, None
return True, {
'domain': match.group('domain'),
'version': f"{match.group('major')}.{match.group('minor')}",
'major': int(match.group('major')),
'minor': int(match.group('minor'))
}
def suggest_schema_filename(domain: str, version: str) -> str:
"""Generate correct schema filename from domain and version."""
# Normalize domain: lowercase, replace spaces with hyphens
domain_clean = domain.lower().replace(' ', '-').replace('_', '-')
return f"{domain_clean}-schema-v{version}.md"
```
**1.3 Add CLI Validation**
**Update:** `markitect/cli.py` - schema-ingest command
```python
@cli.command('schema-ingest')
@click.argument('schema_file', type=click.Path(exists=True, path_type=Path))
@click.option('--force', is_flag=True, help='Skip filename validation')
def schema_ingest(config, schema_file, force):
"""Ingest schema file with filename validation."""
from .schema_naming import validate_schema_filename, suggest_schema_filename
filename = schema_file.name
is_valid, metadata = validate_schema_filename(filename)
if not is_valid and not force:
click.echo(f"❌ Invalid schema filename: {filename}", err=True)
click.echo("\nExpected format: {domain}-schema-v{major}.{minor}.md")
click.echo("Example: manpage-schema-v1.0.md")
# Try to suggest correct name
# ... extract domain/version from file content ...
suggestion = suggest_schema_filename(domain, version)
click.echo(f"\nSuggested filename: {suggestion}")
click.echo("\nUse --force to skip validation")
sys.exit(1)
# Continue with ingestion...
```
**Tasks:**
- [ ] Write `markitect/schema_naming.py`
- [ ] Add unit tests for filename validation
- [ ] Update `schema-ingest` command with validation
- [ ] Test with valid and invalid filenames
**Deliverables:**
- [ ] schema_naming.py with validation logic
- [ ] Unit tests (tests/test_schema_naming.py)
- [ ] Updated CLI with validation
- [ ] SCHEMA_NAMING_SPEC.md documentation
**Duration:** 1 day
---
### Phase 2: Markdown Schema Loader (2-3 days)
**Goal:** Parse markdown files to extract JSON schemas
**2.1 Design Markdown Schema Format**
**Format Specification:**
```markdown
---
schema-id: "https://markitect.dev/schemas/{domain}/v{major}"
version: "{major}.{minor}.{patch}"
status: "stable|draft|deprecated"
---
# {Title} v{version}
## Overview
[Human-readable description]
## Usage
[Examples of how to use this schema]
## Schema Definition
```json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://markitect.dev/schemas/{domain}/v{major}",
"version": "{major}.{minor}.{patch}",
...
}
\```
## Validation Rules
[Explanation of schema rules]
## Version History
[Changelog]
```
**2.2 Implement Markdown Schema Loader**
**File:** `markitect/schema_loader.py` (NEW)
```python
"""
Schema Loader - Extract JSON schemas from markdown files.
Supports:
- YAML frontmatter for metadata
- JSON code block for schema definition
- Validation of schema structure
"""
import re
import json
import yaml
from pathlib import Path
from typing import Dict, Any, Optional, Tuple
class MarkdownSchemaLoader:
"""Load and parse markdown schema files."""
def __init__(self):
self.frontmatter_pattern = re.compile(
r'^---\s*\n(.*?)\n---\s*\n',
re.DOTALL | re.MULTILINE
)
self.json_code_block_pattern = re.compile(
r'```json\s*\n(.*?)\n```',
re.DOTALL | re.MULTILINE
)
def load_schema(self, md_path: Path) -> Dict[str, Any]:
"""
Load schema from markdown file.
Returns:
{
'schema': {...}, # Extracted JSON schema
'metadata': {...}, # Frontmatter metadata
'documentation': '...' # Full markdown content
}
"""
if not md_path.exists():
raise FileNotFoundError(f"Schema file not found: {md_path}")
content = md_path.read_text(encoding='utf-8')
# Extract frontmatter
metadata = self._extract_frontmatter(content)
# Extract JSON schema
schema = self._extract_json_schema(content)
if not schema:
raise ValueError(f"No JSON schema found in {md_path}")
# Merge metadata into schema
schema = self._merge_metadata(schema, metadata, md_path)
return {
'schema': schema,
'metadata': metadata,
'documentation': content,
'source_file': str(md_path)
}
def _extract_frontmatter(self, content: str) -> Dict[str, Any]:
"""Extract YAML frontmatter from markdown."""
match = self.frontmatter_pattern.search(content)
if not match:
return {}
try:
return yaml.safe_load(match.group(1)) or {}
except yaml.YAMLError as e:
raise ValueError(f"Invalid YAML frontmatter: {e}")
def _extract_json_schema(self, content: str) -> Optional[Dict[str, Any]]:
"""Extract JSON schema from code block."""
matches = self.json_code_block_pattern.findall(content)
if not matches:
return None
# Use the first JSON code block as schema
# (or could look for specific heading like "## Schema Definition")
try:
return json.loads(matches[0])
except json.JSONDecodeError as e:
raise ValueError(f"Invalid JSON schema: {e}")
def _merge_metadata(
self,
schema: Dict[str, Any],
metadata: Dict[str, Any],
source_file: Path
) -> Dict[str, Any]:
"""Merge frontmatter metadata into schema."""
# Add MarkiTect-specific metadata
schema['x-markitect-source'] = {
'file': str(source_file),
'format': 'markdown',
'frontmatter': metadata
}
# Override schema fields with frontmatter if present
if 'version' in metadata:
schema['version'] = metadata['version']
if 'schema-id' in metadata:
schema['$id'] = metadata['schema-id']
return schema
def save_schema(
self,
schema: Dict[str, Any],
md_path: Path,
template: Optional[str] = None
):
"""
Save schema as markdown file.
Args:
schema: JSON schema dict
md_path: Output path
template: Optional markdown template
"""
if template:
# Use provided template
content = self._render_template(template, schema)
else:
# Generate basic markdown
content = self._generate_markdown(schema)
md_path.write_text(content, encoding='utf-8')
def _generate_markdown(self, schema: Dict[str, Any]) -> str:
"""Generate markdown from schema."""
title = schema.get('title', 'Untitled Schema')
version = schema.get('version', '1.0.0')
description = schema.get('description', '')
# Generate frontmatter
frontmatter = yaml.dump({
'schema-id': schema.get('$id', ''),
'version': version,
'status': 'draft'
}, default_flow_style=False)
# Generate markdown
md = f"""---
{frontmatter}---
# {title} v{version}
## Overview
{description}
## Schema Definition
```json
{json.dumps(schema, indent=2)}
```
## Version History
### v{version}
- Initial version
"""
return md
class SchemaLoaderError(Exception):
"""Base exception for schema loading errors."""
pass
```
**2.3 Update Schema Validator**
**Update:** `markitect/schema_validator.py`
```python
from .schema_loader import MarkdownSchemaLoader
class SchemaValidator:
def __init__(self):
self.schema_generator = SchemaGenerator()
self.jsonschema_available = JSONSCHEMA_AVAILABLE
self.md_loader = MarkdownSchemaLoader() # NEW
def validate_file_against_schema_file(
self,
file_path: Path,
schema_file_path: Path
) -> bool:
"""Validate file against schema (supports .json and .md)."""
# Detect schema file format
if schema_file_path.suffix == '.md':
# Load from markdown
schema_data = self.md_loader.load_schema(schema_file_path)
schema = schema_data['schema']
else:
# Load from JSON (legacy)
schema_content = schema_file_path.read_text(encoding='utf-8')
schema = json.loads(schema_content)
return self.validate_file_against_schema(file_path, schema)
```
**Tasks:**
- [ ] Implement MarkdownSchemaLoader class
- [ ] Add frontmatter extraction (YAML)
- [ ] Add JSON code block extraction
- [ ] Add metadata merging logic
- [ ] Write comprehensive unit tests
- [ ] Update SchemaValidator to use loader
- [ ] Test with example markdown schemas
**Deliverables:**
- [ ] schema_loader.py implementation
- [ ] Unit tests (tests/test_schema_loader.py)
- [ ] Updated schema_validator.py
- [ ] Integration tests
**Duration:** 2-3 days
---
### Phase 3: Schema-for-Schemas (2 days)
**Goal:** Create metaschema to validate all schema files
**3.1 Design Schema-for-Schemas**
**File:** `markitect/schemas/schema-schema-v1.md`
**Purpose:** Validates that schema files follow MarkiTect conventions
**Validates:**
- Required fields ($schema, $id, version, title, description)
- Version format (SemVer)
- $id URL format
- x-markitect-* extensions
- Section classifications
- Content control structures
**3.2 Implement Schema-for-Schemas**
See separate file: `roadmap/schema-of-schemas/schema-schema-v1.md` (to be created)
**3.3 Add Schema Validation Command**
**New CLI command:** `markitect schema-validate`
```python
@cli.command('schema-validate')
@click.argument('schema_file', type=click.Path(exists=True, path_type=Path))
@click.option('--detailed-errors', is_flag=True)
def schema_validate(config, schema_file, detailed_errors):
"""
Validate a schema file against the schema-for-schemas.
Ensures schema files follow MarkiTect conventions and standards.
"""
from .schema_loader import MarkdownSchemaLoader
from .schema_validator import SchemaValidator
loader = MarkdownSchemaLoader()
validator = SchemaValidator()
# Load the schema
try:
schema_data = loader.load_schema(schema_file)
schema = schema_data['schema']
except Exception as e:
click.echo(f"❌ Failed to load schema: {e}", err=True)
sys.exit(1)
# Load schema-for-schemas
metaschema_path = Path(__file__).parent / 'schemas' / 'schema-schema-v1.md'
metaschema_data = loader.load_schema(metaschema_path)
metaschema = metaschema_data['schema']
# Validate
is_valid = validator.validate_schema_against_metaschema(schema, metaschema)
if is_valid:
click.echo(f"✅ Schema is valid: {schema_file.name}")
click.echo(f" Title: {schema.get('title')}")
click.echo(f" Version: {schema.get('version')}")
else:
click.echo(f"❌ Schema validation failed: {schema_file.name}", err=True)
if detailed_errors:
# Show detailed errors
pass
sys.exit(1)
```
**Tasks:**
- [ ] Design schema-for-schemas structure
- [ ] Implement schema-schema-v1.md
- [ ] Add schema validation logic
- [ ] Create `schema-validate` CLI command
- [ ] Test all existing schemas against metaschema
- [ ] Document validation rules
**Deliverables:**
- [ ] schema-schema-v1.md (metaschema)
- [ ] schema-validate command
- [ ] Validation documentation
- [ ] Test suite
**Duration:** 2 days
---
### Phase 4: Schema Migration (1-2 days)
**Goal:** Convert existing schemas to new format
**4.1 Inventory Current Schemas**
Current schemas in database:
```
1. terminology-schema.json → terminology-schema-v1.0.md
2. api-documentation → api-documentation-schema-v1.0.md
3. enhanced-manpage → manpage-schema-v2.0.md
4. markdown-manpage → manpage-schema-v1.0.md (DUPLICATE)
5. markdown-manpage-schema.json → manpage-schema-v1.0.md (DUPLICATE)
```
**Decision matrix:**
- Keep enhanced-manpage as v2.0 (has classifications)
- Merge markdown-manpage variants into v1.0
- Update terminology to v1.0
- Update api-documentation to v1.0
**4.2 Create Migration Script**
**File:** `scripts/migrate_schemas.py`
```python
#!/usr/bin/env python3
"""Migrate schemas to markdown format with versioning."""
from pathlib import Path
from markitect.schema_loader import MarkdownSchemaLoader
from markitect.database import DatabaseManager
def migrate_schema(
db_manager: DatabaseManager,
old_name: str,
new_name: str,
version: str,
domain: str
):
"""Migrate single schema to new format."""
# Get old schema from database
old_schema = db_manager.get_schema_file(old_name)
if not old_schema:
print(f"❌ Schema not found: {old_name}")
return
schema_json = json.loads(old_schema['schema_content'])
# Update metadata
schema_json['version'] = version
schema_json['$id'] = f"https://markitect.dev/schemas/{domain}/v{version.split('.')[0]}"
# Save as markdown
loader = MarkdownSchemaLoader()
md_path = Path(f"markitect/schemas/{new_name}")
loader.save_schema(schema_json, md_path)
print(f"✓ Migrated: {old_name}{new_name}")
# Ingest new schema
# ... ingest markdown schema to database ...
return md_path
def main():
migrations = [
('terminology-schema.json', 'terminology-schema-v1.0.md', '1.0.0', 'terminology'),
('api-documentation', 'api-documentation-schema-v1.0.md', '1.0.0', 'api-documentation'),
('enhanced-manpage', 'manpage-schema-v2.0.md', '2.0.0', 'manpage'),
('markdown-manpage', 'manpage-schema-v1.0.md', '1.0.0', 'manpage'),
]
db = DatabaseManager('markitect.db')
for old, new, version, domain in migrations:
migrate_schema(db, old, new, version, domain)
```
**4.3 Execute Migration**
```bash
# Run migration script
python scripts/migrate_schemas.py
# Validate all new schemas
for schema in markitect/schemas/*-schema-v*.md; do
markitect schema-validate "$schema"
done
# Ingest new schemas
for schema in markitect/schemas/*-schema-v*.md; do
markitect schema-ingest "$schema"
done
```
**4.4 Clean Up Registry**
```bash
# Remove old schemas from database
markitect schema-delete markdown-manpage
markitect schema-delete markdown-manpage-schema.json
markitect schema-delete api-documentation
markitect schema-delete enhanced-manpage
markitect schema-delete terminology-schema.json
# Verify cleanup
markitect schema-list
# Should show only versioned .md schemas
```
**Tasks:**
- [ ] Create schema inventory
- [ ] Write migration script
- [ ] Test migration on one schema
- [ ] Execute full migration
- [ ] Validate all migrated schemas
- [ ] Remove old schemas from database
- [ ] Update examples to use new schema names
**Deliverables:**
- [ ] scripts/migrate_schemas.py
- [ ] All schemas migrated to markdown
- [ ] Clean registry (no duplicates)
- [ ] Migration report
**Duration:** 1-2 days
---
### Phase 5: CLI & Documentation Updates (1 day)
**Goal:** Update CLI and documentation for new system
**5.1 Update CLI Commands**
Commands to update:
- `schema-ingest` - Accept .md files, validate filename
- `schema-list` - Show version in output
- `schema-get` - Export as .md or .json
- `validate` - Accept .md schema files
- `generate-stub` - Work with .md schemas
- `schema-generate` - Output .md format option
- NEW: `schema-validate` - Validate against metaschema
**5.2 Update Documentation**
Files to update:
- README.md - Mention markdown schemas
- examples/terminology/README.md - Use new schema name
- docs/specifications/schema-extensions-spec.md - Document markdown format
- Create: docs/guides/schema-authoring-guide.md
**5.3 Add Schema Templates**
**File:** `templates/schema-template-v1.md`
```markdown
---
schema-id: "https://markitect.dev/schemas/DOMAIN/v1"
version: "1.0.0"
status: "draft"
---
# TITLE Schema v1.0
## Overview
[Description of what this schema validates]
## Document Types
- [Document type 1]
- [Document type 2]
## Usage
\`\`\`bash
markitect validate document.md --schema DOMAIN-schema-v1.0.md
\`\`\`
## Examples
See [examples/DOMAIN/example.md](../../examples/DOMAIN/example.md)
## Schema Definition
\`\`\`json
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://markitect.dev/schemas/DOMAIN/v1",
"version": "1.0.0",
"title": "TITLE Schema",
"description": "Schema for validating DESCRIPTION",
"type": "object",
"properties": {
"headings": {
"type": "object"
}
},
"x-markitect-sections": {},
"x-markitect-content-control": {}
}
\`\`\`
## Validation Rules
### Required Sections
- **SECTION** - Description
### Optional Sections
- **SECTION** - Description
## Version History
### v1.0.0 (YYYY-MM-DD)
- Initial release
```
**Tasks:**
- [ ] Update all CLI commands for .md support
- [ ] Update documentation
- [ ] Create schema authoring guide
- [ ] Add schema template
- [ ] Update examples
- [ ] Test all workflows end-to-end
**Deliverables:**
- [ ] Updated CLI commands
- [ ] Schema authoring guide
- [ ] Schema template
- [ ] Updated examples
- [ ] End-to-end tests
**Duration:** 1 day
---
### Phase 6: Testing & Validation (1 day)
**Goal:** Comprehensive testing of new system
**6.1 Unit Tests**
Test coverage for:
- `schema_naming.py` - Filename validation
- `schema_loader.py` - Markdown parsing
- `schema_validator.py` - Validation with .md schemas
**6.2 Integration Tests**
End-to-end workflows:
1. Create new schema in markdown format
2. Validate schema against schema-for-schemas
3. Ingest schema to database
4. Use schema to validate documents
5. Generate stub from schema
6. Export schema
**6.3 Regression Tests**
Ensure existing functionality still works:
- JSON schemas still load (backward compatibility)
- All existing documents validate
- Schema generation still works
- Stub generation still works
**Tasks:**
- [ ] Write unit tests for new modules
- [ ] Create integration test suite
- [ ] Run regression tests
- [ ] Fix any issues found
- [ ] Achieve >80% code coverage
- [ ] Document test procedures
**Deliverables:**
- [ ] Unit tests (>80% coverage)
- [ ] Integration tests
- [ ] Regression test suite
- [ ] Test documentation
**Duration:** 1 day
---
## Timeline
```
Week 1:
Day 1: Phase 0 (Planning) + Phase 1 (Naming Convention)
Day 2-3: Phase 2 (Markdown Loader)
Day 4-5: Phase 3 (Schema-for-Schemas)
Week 2:
Day 6-7: Phase 4 (Migration)
Day 8: Phase 5 (CLI & Docs)
Day 9: Phase 6 (Testing)
Day 10: Buffer for issues/refinement
```
**Total:** 8-10 days
## Risks & Mitigation
### Risk 1: Parsing Complexity
**Risk:** Markdown parsing more complex than expected
**Probability:** Medium
**Impact:** High
**Mitigation:**
- Start with simple regex-based parser
- Test extensively with edge cases
- Have fallback to simpler format
### Risk 2: Backward Compatibility
**Risk:** Breaking existing workflows
**Probability:** Low
**Impact:** High
**Mitigation:**
- Support both .json and .md during transition
- Provide migration script
- Test thoroughly with existing documents
### Risk 3: Schema-for-Schemas Complexity
**Risk:** Self-referential validation complex
**Probability:** Medium
**Impact:** Medium
**Mitigation:**
- Start with simple metaschema
- Iterate based on actual schemas
- Don't over-engineer initially
## Success Metrics
- [ ] All schemas follow naming convention (5/5)
- [ ] All schemas in markdown format (5/5)
- [ ] All schemas validate against metaschema (5/5)
- [ ] Zero duplicate schemas in registry
- [ ] CLI commands work with .md schemas
- [ ] Documentation comprehensive
- [ ] Test coverage >80%
- [ ] No regression in existing functionality
## Deliverables Checklist
### Code
- [ ] markitect/schema_naming.py
- [ ] markitect/schema_loader.py
- [ ] markitect/schemas/schema-schema-v1.md
- [ ] scripts/migrate_schemas.py
- [ ] Updated CLI commands
- [ ] Unit tests
- [ ] Integration tests
### Documentation
- [ ] SCHEMA_NAMING_SPEC.md
- [ ] SCHEMA_METADATA_SPEC.md
- [ ] Schema authoring guide
- [ ] Migration guide
- [ ] Updated examples
- [ ] IMPLEMENTATION_LOG.md
### Schemas
- [ ] terminology-schema-v1.0.md
- [ ] api-documentation-schema-v1.0.md
- [ ] manpage-schema-v1.0.md
- [ ] manpage-schema-v2.0.md
- [ ] schema-schema-v1.0.md
### Registry
- [ ] Clean schema database
- [ ] Updated schema-catalog.yaml
- [ ] No duplicates
## Next Steps
1. **Review this workplan** - Get approval
2. **Phase 0** - Complete planning artifacts
3. **Phase 1** - Implement naming validation
4. **Checkpoint** - Review progress after Phase 1
5. **Continue** - Execute remaining phases
## Approval
- [ ] Workplan reviewed
- [ ] Approach approved
- [ ] Ready to begin implementation
---
**Status:** Awaiting approval
**Next Action:** Complete Phase 0 planning artifacts