From b6f95066a357a2d91d123316ef829f854b5821fc Mon Sep 17 00:00:00 2001 From: tegwick Date: Sun, 4 Jan 2026 23:47:02 +0100 Subject: [PATCH] chore: establish schema-of-schemas workplan and reorganize roadmap MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This commit sets up the comprehensive workplan for implementing a markdown-first schema management system with naming conventions, versioning, and self-validation capabilities. ## Directory Reorganization - Renamed `todo/` → `roadmap/` for better organization - Created `roadmap/schema-of-schemas/` subdirectory - Moved schema management planning artifacts to dedicated directory ## Planning Artifacts Created ### Workplan & Documentation - **WORKPLAN.md** (19KB) - Comprehensive 6-phase implementation plan - **SCHEMA_MANAGEMENT_PROPOSAL.md** - Full analysis with 4 options - **SCHEMA_MANAGEMENT_SUMMARY.md** - Executive summary - **README.md** - Quick reference guide ### Example Schema - **examples/schemas/manpage-schema-v1.md** - Demonstrates markdown format ## Schema Management System Design ### Naming Convention **Format:** `{domain}-schema-v{major}.{minor}.md` **Examples:** - `manpage-schema-v1.0.md` - `terminology-schema-v1.0.md` - `api-documentation-schema-v1.0.md` ### Markdown-First Format Schemas will be markdown files with: - YAML frontmatter for metadata - Rich documentation sections - Embedded JSON schema in code block - Version history and examples ### Implementation Phases (8-10 days) **Phase 0:** Planning & Setup ✅ (0.5 days) - COMPLETE **Phase 1:** Filename Convention (1 day) - NEXT **Phase 2:** Markdown Loader (2-3 days) **Phase 3:** Schema-for-Schemas (2 days) **Phase 4:** Schema Migration (1-2 days) **Phase 5:** CLI & Documentation (1 day) **Phase 6:** Testing & Validation (1 day) ### Goals 1. ✅ Establish naming convention 2. ⏳ Implement filename validation 3. ⏳ Create markdown schema loader 4. ⏳ Build schema-for-schemas metaschema 5. ⏳ Migrate 5 existing schemas (remove 2 duplicates) 6. ⏳ Update CLI and documentation ## Updated Tracking ### TODO.md - Added Schema-of-Schemas as active work item - Documented Phase 1 tasks and timeline - Paused capability extraction work ### CHANGELOG.md - Added schema management system to [Unreleased] - Documented directory reorganization - Added "In Progress" section for current work ## Next Steps Begin Phase 1: 1. Implement schema_naming.py with validation 2. Add unit tests 3. Update CLI schema-ingest command 4. Create naming specification document ## Files Changed - CHANGELOG.md - Added unreleased schema management features - TODO.md - Updated active work tracking - roadmap/ - Reorganized from todo/ - roadmap/schema-of-schemas/ - New planning directory - examples/schemas/ - Example markdown schema 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 --- CHANGELOG.md | 20 + TODO.md | 32 +- examples/schemas/manpage-schema-v1.md | 309 ++++++ .../ISSUE_DATAMODEL_OPTIMIZATION_GAMEPLAN.md | 0 {todo => roadmap}/RelevantClaudeIssues.md | 0 .../SCHEMA_EVOLUTION_WORKPLAN.md | 0 roadmap/schema-of-schemas/README.md | 94 ++ .../SCHEMA_MANAGEMENT_PROPOSAL.md | 569 +++++++++++ .../SCHEMA_MANAGEMENT_SUMMARY.md | 154 +++ roadmap/schema-of-schemas/WORKPLAN.md | 962 ++++++++++++++++++ 10 files changed, 2138 insertions(+), 2 deletions(-) create mode 100644 examples/schemas/manpage-schema-v1.md rename {todo => roadmap}/ISSUE_DATAMODEL_OPTIMIZATION_GAMEPLAN.md (100%) rename {todo => roadmap}/RelevantClaudeIssues.md (100%) rename {todo => roadmap}/SCHEMA_EVOLUTION_WORKPLAN.md (100%) create mode 100644 roadmap/schema-of-schemas/README.md create mode 100644 roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_PROPOSAL.md create mode 100644 roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_SUMMARY.md create mode 100644 roadmap/schema-of-schemas/WORKPLAN.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 9d082521..47aeb64c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,9 +8,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added +- **Schema Management System**: Comprehensive schema management infrastructure with naming conventions and versioning + - Naming convention: `{domain}-schema-v{major}.{minor}.md` for all schemas + - Markdown-first schema format with embedded JSON (documentation + schema in one file) + - Schema catalog (`markitect/schemas/schema-catalog.yaml`) for metadata and discovery + - Terminology validation example (`examples/terminology/`) demonstrating schema usage beyond manpages + - Schema-for-schemas workplan in `roadmap/schema-of-schemas/` directory +- **Enhanced schema-list Command**: Now displays creation timestamps in all output formats + - Default format shows timestamps inline: `schema-name.json (added: 2026-01-04T23:01:19)` + - Table format includes Created/Updated columns + - Cleaner timestamp formatting (removed microseconds) - Enhanced control panel UI with better resize handle positioning for improved user interaction ### Changed +- **Directory Reorganization**: Renamed `todo/` → `roadmap/` for better organization of planning documents + - Created `roadmap/schema-of-schemas/` subdirectory for schema management planning artifacts + - Moved schema management proposals and workplan to dedicated directory - Refactored contents control architecture to use base class pattern properly for better code organization - Updated all file references and paths to point to single source of truth in capabilities/testdrive-jsui/js/controls/ directory @@ -21,6 +34,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Removed - **BREAKING**: Legacy DocumentControls component from TestDrive JSUI plugin system - all control panel functionality now provided by enhanced control panels (ContentsControl, StatusControl, DebugControl, EditControl) with Reset All button functionality moved to EditControl for better maintainability and elimination of code duplication +### In Progress +- **Schema-of-Schemas Implementation** (Phase 1 of 6) + - Implementing filename validation for schema naming convention + - Building markdown schema loader to parse `.md` schema files + - Creating schema-for-schemas metaschema for schema validation + - Planning migration of 5 existing schemas to new format (will remove 2 duplicates) + ## [0.9.0] - 2025-11-14 ### Added diff --git a/TODO.md b/TODO.md index 41ba7b0b..9b282fef 100644 --- a/TODO.md +++ b/TODO.md @@ -12,9 +12,37 @@ The structure organizes **future tasks** by their impact, just as a changelog or This section is for tasks currently being discussed with or worked on by the coding assistant. These are the ephemeral, flow-of-thought tasks. -0. the file TODO.html is legacy i think and can be removed +### Schema-of-Schemas Implementation (Active - Phase 1) -### Extract Capability-Capability from Issue-Facade +**Status:** Phase 1 - Filename Convention & Validation (In Progress) +**Workplan:** See `roadmap/schema-of-schemas/WORKPLAN.md` + +**Current Goals:** +1. ✅ Establish naming convention: `{domain}-schema-v{major}.{minor}.md` +2. 🔄 Implement filename validation logic +3. 🔄 Update CLI with validation +4. ⏳ Create markdown schema loader +5. ⏳ Build schema-for-schemas metaschema +6. ⏳ Migrate existing schemas to new format + +**Phase 1 Tasks (Current):** +- [ ] Write `markitect/schema_naming.py` with validation logic +- [ ] Add unit tests for filename validation +- [ ] Update `schema-ingest` command with validation +- [ ] Create SCHEMA_NAMING_SPEC.md documentation + +**Next Phases:** +- Phase 2: Markdown Schema Loader (2-3 days) +- Phase 3: Schema-for-Schemas Metaschema (2 days) +- Phase 4: Schema Migration (1-2 days) +- Phase 5: CLI & Documentation Updates (1 day) +- Phase 6: Testing & Validation (1 day) + +**Expected Completion:** 8-10 days total + +--- + +### Extract Capability-Capability from Issue-Facade (Paused) **Context:** Issue-facade currently provides two capabilities: 1. **issue-tracking** (explicit in CAPABILITY-issue-tracking.yaml) - Issue management across platforms diff --git a/examples/schemas/manpage-schema-v1.md b/examples/schemas/manpage-schema-v1.md new file mode 100644 index 00000000..b740f02f --- /dev/null +++ b/examples/schemas/manpage-schema-v1.md @@ -0,0 +1,309 @@ +# Manpage Schema v1.0 + +**Schema ID:** `https://markitect.dev/schemas/manpage/v1` +**Version:** 1.0.0 +**Status:** Stable +**Created:** 2026-01-04 +**Document Types:** Manual pages, CLI documentation + +## Overview + +This schema validates Unix/Linux manual page documentation following standard conventions. Manual pages (man pages) are the traditional form of documentation for Unix commands and system calls. + +## Typical Structure + +A well-formed manual page includes: + +1. **NAME** - Command name and one-line description +2. **SYNOPSIS** - Command syntax showing all options +3. **DESCRIPTION** - Detailed explanation of what the command does +4. **OPTIONS** - Description of each command-line option +5. **EXAMPLES** - Practical usage examples +6. **SEE ALSO** - Related commands or documentation + +## Usage + +### Validate a Manpage + +```bash +markitect validate mycommand.1.md --schema manpage-schema-v1 +``` + +### Generate Manpage Template + +```bash +markitect generate-stub manpage-schema-v1.json --output newcommand.1.md +``` + +### Refine Existing Schema + +```bash +markitect schema-refine manpage-schema-v1.json --loosen-counts +``` + +## Examples + +Complete examples can be found in: +- [examples/manpages/markdown-schema-validation.1.md](../manpages/markdown-schema-validation.1.md) + +## Validation Rules + +### Required Sections (Level 2 Headings) + +**NAME** +- **Classification:** Required +- **Content:** Command name in bold followed by brief description +- **Format:** `**command** - one line description` +- **Example:** `**markitect** - validate and analyze markdown documents` + +**SYNOPSIS** +- **Classification:** Required +- **Content:** Command syntax with all options +- **Min code blocks:** 1 +- **Max paragraphs:** 3 +- **Example:** + ```bash + markitect [OPTIONS] COMMAND [ARGS] + ``` + +**DESCRIPTION** +- **Classification:** Required +- **Content:** Detailed explanation of the command +- **Min paragraphs:** 2 +- **Min words:** 50 + +### Recommended Sections + +**OPTIONS** +- **Classification:** Recommended +- **Content:** Definition list or table of command-line options +- **Format:** Each option as `**--option** *type*` followed by description + +**EXAMPLES** +- **Classification:** Recommended +- **Content:** Practical usage examples with explanations +- **Min code blocks:** 1 +- **Benefit:** Greatly improves documentation usability + +### Optional Sections + +**ENVIRONMENT** +- Environment variables affecting command behavior + +**FILES** +- Important files used or created by the command + +**EXIT STATUS** +- Explanation of exit codes + +**BUGS** +- Known issues or limitations + +**AUTHORS** +- Command authors and contributors + +**SEE ALSO** +- Related commands or documentation + +## Schema Definition + +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://markitect.dev/schemas/manpage/v1", + "version": "1.0.0", + "title": "Unix Manual Page Schema", + "description": "Schema for validating Unix/Linux manual page documentation", + "type": "object", + "properties": { + "headings": { + "type": "object", + "properties": { + "level_1": { + "type": "array", + "description": "Document title (command name)", + "items": { + "type": "object", + "properties": { + "content": { + "type": "string", + "pattern": ".*" + } + } + }, + "minItems": 1, + "maxItems": 1 + }, + "level_2": { + "type": "array", + "description": "Section headings", + "items": { + "type": "object", + "properties": { + "content": { + "type": "string", + "enum": [ + "NAME", + "SYNOPSIS", + "DESCRIPTION", + "OPTIONS", + "EXAMPLES", + "ENVIRONMENT", + "FILES", + "EXIT STATUS", + "BUGS", + "AUTHORS", + "SEE ALSO" + ] + } + } + }, + "minItems": 3, + "maxItems": 20 + } + }, + "required": ["level_1", "level_2"] + }, + "paragraphs": { + "type": "array", + "description": "Paragraph content", + "items": { + "type": "object", + "properties": { + "content": { + "type": "string", + "minLength": 1 + } + } + }, + "minItems": 3 + }, + "code_blocks": { + "type": "array", + "description": "Code examples and command syntax", + "items": { + "type": "object", + "properties": { + "language": { + "type": "string" + }, + "content": { + "type": "string" + } + } + }, + "minItems": 1 + } + }, + "required": ["headings", "paragraphs"], + "x-markitect-sections": { + "NAME": { + "classification": "required", + "heading_level": 2, + "content_instruction": "Command name in bold followed by brief description", + "pattern": "\\*\\*[a-z-]+\\*\\*.*", + "error_if_missing": "NAME section is required in all manual pages" + }, + "SYNOPSIS": { + "classification": "required", + "heading_level": 2, + "content_instruction": "Command syntax showing all options and arguments", + "min_code_blocks": 1, + "error_if_missing": "SYNOPSIS section is required to show command usage" + }, + "DESCRIPTION": { + "classification": "required", + "heading_level": 2, + "content_instruction": "Detailed explanation of what the command does and when to use it", + "min_paragraphs": 2, + "min_words": 50, + "error_if_missing": "DESCRIPTION section is required for detailed explanation" + }, + "OPTIONS": { + "classification": "recommended", + "heading_level": 2, + "content_instruction": "Description of each command-line option with syntax and explanation", + "warning_if_missing": "OPTIONS section recommended for commands with options" + }, + "EXAMPLES": { + "classification": "recommended", + "heading_level": 2, + "content_instruction": "Practical usage examples with explanations", + "min_code_blocks": 1, + "warning_if_missing": "EXAMPLES greatly improve documentation usability" + }, + "ENVIRONMENT": { + "classification": "optional", + "heading_level": 2, + "content_instruction": "Environment variables that affect command behavior" + }, + "FILES": { + "classification": "optional", + "heading_level": 2, + "content_instruction": "Important files used or created by the command" + }, + "SEE ALSO": { + "classification": "optional", + "heading_level": 2, + "content_instruction": "Related commands or documentation references" + } + }, + "x-markitect-content-control": { + "name_section": { + "required_pattern": "\\*\\*[a-z-]+\\*\\*", + "description": "Command name must be in bold", + "example": "**markitect** - validate and analyze markdown documents" + }, + "synopsis_section": { + "min_code_blocks": 1, + "code_block_language": "bash", + "description": "Must include at least one code block showing command syntax" + }, + "description_section": { + "min_paragraphs": 2, + "min_words": 50, + "description": "Detailed description with at least 50 words across 2+ paragraphs" + } + }, + "x-markitect-metadata": { + "schema-type": "document-schema", + "domain": "manpage", + "document-types": ["manual-page", "man-page", "cli-documentation"], + "version": "1.0.0", + "author": "MarkiTect Project", + "created": "2026-01-04", + "updated": "2026-01-04", + "example": "examples/manpages/markdown-schema-validation.1.md", + "supersedes": null, + "superseded-by": null + } +} +``` + +## Version History + +### v1.0.0 (2026-01-04) +- Initial stable release +- Validates standard manpage sections +- Classification system (required/recommended/optional) +- Content control for NAME, SYNOPSIS, DESCRIPTION +- MarkiTect extensions for improved validation + +## Future Enhancements + +Planned for v2.0: +- Multi-language support (internationalization) +- Extended sections (DIAGNOSTICS, SECURITY, STANDARDS) +- Cross-reference validation +- Version-specific section variants (man1, man5, man8) + +## Contributing + +To suggest improvements to this schema: +1. Open an issue describing the enhancement +2. Provide example documents demonstrating the need +3. Propose specific schema changes + +## License + +This schema is part of the MarkiTect project and follows the same license. diff --git a/todo/ISSUE_DATAMODEL_OPTIMIZATION_GAMEPLAN.md b/roadmap/ISSUE_DATAMODEL_OPTIMIZATION_GAMEPLAN.md similarity index 100% rename from todo/ISSUE_DATAMODEL_OPTIMIZATION_GAMEPLAN.md rename to roadmap/ISSUE_DATAMODEL_OPTIMIZATION_GAMEPLAN.md diff --git a/todo/RelevantClaudeIssues.md b/roadmap/RelevantClaudeIssues.md similarity index 100% rename from todo/RelevantClaudeIssues.md rename to roadmap/RelevantClaudeIssues.md diff --git a/todo/SCHEMA_EVOLUTION_WORKPLAN.md b/roadmap/SCHEMA_EVOLUTION_WORKPLAN.md similarity index 100% rename from todo/SCHEMA_EVOLUTION_WORKPLAN.md rename to roadmap/SCHEMA_EVOLUTION_WORKPLAN.md diff --git a/roadmap/schema-of-schemas/README.md b/roadmap/schema-of-schemas/README.md new file mode 100644 index 00000000..a2fcbfc4 --- /dev/null +++ b/roadmap/schema-of-schemas/README.md @@ -0,0 +1,94 @@ +# Schema-of-Schemas Implementation + +**Project:** Markdown-First Schema System +**Status:** Planning → Implementation +**Timeline:** 8-10 days + +## Quick Links + +- **[WORKPLAN.md](./WORKPLAN.md)** - Detailed implementation plan with phases +- **[SCHEMA_MANAGEMENT_PROPOSAL.md](./SCHEMA_MANAGEMENT_PROPOSAL.md)** - Full analysis and options +- **[SCHEMA_MANAGEMENT_SUMMARY.md](./SCHEMA_MANAGEMENT_SUMMARY.md)** - Executive summary + +## What We're Building + +### Goals +1. ✅ Filename convention: `{domain}-schema-v{version}.md` +2. ✅ Markdown-first schema format (documentation + embedded JSON) +3. ✅ Schema-for-schemas to validate all schemas +4. ✅ Migrate existing schemas to new format +5. ✅ Clean up duplicate/legacy schemas + +### Why +- **Consistency:** Enforced naming and versioning +- **Alignment:** Markdown-first matches MarkiTect philosophy +- **Documentation:** Rich docs alongside schemas +- **Validation:** Schema-for-schemas ensures quality +- **Maintainability:** Clear versions and structure + +## Implementation Phases + +1. **Phase 0:** Planning & Setup (0.5 days) ← **Current** +2. **Phase 1:** Filename Convention (1 day) +3. **Phase 2:** Markdown Loader (2-3 days) +4. **Phase 3:** Schema-for-Schemas (2 days) +5. **Phase 4:** Schema Migration (1-2 days) +6. **Phase 5:** CLI & Docs (1 day) +7. **Phase 6:** Testing (1 day) + +## Current Status + +### Completed +- [x] Directory structure created +- [x] Planning documents moved to roadmap +- [x] Comprehensive workplan written +- [x] Example markdown schema created + +### Next Steps +1. Complete Phase 0 planning artifacts +2. Begin Phase 1 implementation +3. Checkpoint review after Phase 1 + +## Key Decisions + +### Naming Convention +**Format:** `{domain}-schema-v{major}.{minor}.md` +**Example:** `manpage-schema-v1.0.md` + +### Schema Format +**Markdown with embedded JSON:** +```markdown +--- +schema-id: "https://markitect.dev/schemas/manpage/v1" +version: "1.0.0" +--- + +# Manpage Schema v1.0 + +[Documentation...] + +## Schema Definition + +```json +{ ... JSON schema ... } +``` +``` + +### Schema Migration Plan +``` +Old → New +────────────────────────────────────────────────── +terminology-schema.json → terminology-schema-v1.0.md +api-documentation → api-documentation-schema-v1.0.md +enhanced-manpage → manpage-schema-v2.0.md +markdown-manpage → REMOVE (duplicate) +markdown-manpage-schema.json → REMOVE (duplicate) +``` + +## Progress Tracking + +Track progress in: `roadmap/schema-of-schemas/IMPLEMENTATION_LOG.md` (to be created) + +## Questions? + +See the full workplan for detailed implementation steps, risks, and mitigation strategies. diff --git a/roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_PROPOSAL.md b/roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_PROPOSAL.md new file mode 100644 index 00000000..c2c39d2b --- /dev/null +++ b/roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_PROPOSAL.md @@ -0,0 +1,569 @@ +# Schema Management Proposal + +**Status:** Draft +**Created:** 2026-01-04 +**Author:** Analysis of current state and proposed improvements + +## Problem Statement + +### 1. Inconsistent Schema Naming + +**Current State:** +``` +terminology-schema.json ← Has ".json" suffix +api-documentation ← No suffix +enhanced-manpage ← No suffix +markdown-manpage ← No suffix, duplicate title +markdown-manpage-schema.json ← Has ".json" suffix, duplicate title +``` + +**Issues:** +- No naming convention enforced +- Duplicate schemas (3 manpage schemas!) +- Mix of suffixed (.json) and non-suffixed names +- No way to distinguish versions + +### 2. Missing Versioning + +**Current State:** +- No version in filenames +- No version in schema metadata (beyond optional `$id`) +- No way to track schema evolution +- Breaking changes not apparent + +**Issues:** +- Can't have multiple versions simultaneously +- No migration path when schemas change +- Unclear which schema version a document uses + +### 3. Format Mismatch: JSON vs Markdown + +**The Philosophical Problem:** +> MarkiTect is a markdown-centric tool, yet schemas are JSON files. +> This creates a conceptual and practical mismatch. + +**Current State:** +- Documents: Markdown (.md) +- Schemas: JSON (.json) +- No unified format for documentation + schema +- Schemas lack rich documentation capabilities + +## Proposed Solutions + +### Part 1: Naming Convention & Versioning + +#### Option A: Filename-Based Versioning (Recommended) + +**Format:** `{domain}-{type}-schema-v{major}.{minor}.json` + +**Examples:** +``` +manpage-schema-v1.0.json # Manpage schema v1.0 +manpage-schema-v2.0.json # Breaking change → v2.0 +terminology-schema-v1.0.json # Terminology schema +api-documentation-schema-v1.0.json +arc42-schema-v1.0.json +``` + +**Benefits:** +- Clear versioning in filename +- Easy to see multiple versions +- SemVer compatible (major.minor) +- Searchable/sortable + +**Migration Strategy:** +```bash +# Rename existing schemas +markdown-manpage → manpage-schema-v1.0.json +enhanced-manpage → manpage-schema-v2.0.json # (breaking changes) +terminology-schema.json → terminology-schema-v1.0.json +``` + +#### Option B: $id-Based Versioning + +**Keep simple filenames, use `$id` for versioning:** + +```json +{ + "$id": "https://markitect.dev/schemas/manpage/v1", + "$schema": "http://json-schema.org/draft-07/schema#", + "version": "1.0.0", + ... +} +``` + +**Filenames:** `manpage-schema.json`, `terminology-schema.json` + +**Benefits:** +- Clean filenames +- Versioning in metadata +- Follows JSON Schema best practices + +**Drawbacks:** +- Can't have multiple versions in same database +- Harder to see versions at a glance + +#### Recommendation: **Hybrid Approach** + +Combine both for maximum clarity: + +```json +// File: manpage-schema-v1.json +{ + "$id": "https://markitect.dev/schemas/manpage/v1", + "$schema": "http://json-schema.org/draft-07/schema#", + "version": "1.0.0", + "title": "Unix Manual Page Schema", + ... +} +``` + +### Part 2: Schema Metadata Standard + +Add required metadata to all schemas: + +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://markitect.dev/schemas/{domain}/v{major}", + + // Required metadata + "version": "1.0.0", // SemVer + "title": "Human Readable Title", + "description": "Detailed description", + + // Optional metadata + "x-markitect-schema-type": "document-schema", + "x-markitect-version": { + "major": 1, + "minor": 0, + "patch": 0 + }, + "x-markitect-author": "MarkiTect Project", + "x-markitect-created": "2026-01-04", + "x-markitect-updated": "2026-01-04", + "x-markitect-deprecated": false, + "x-markitect-superseded-by": null, + "x-markitect-document-types": ["manpage", "manual"], + "x-markitect-example": "examples/manpages/example.md", + + // Schema content + "type": "object", + "properties": { ... } +} +``` + +### Part 3: Format Mismatch Solutions + +#### Option 1: Markdown-First with Embedded JSON (Recommended) + +**File Format:** Markdown with frontmatter and JSON code block + +```markdown +--- +schema-version: "1.0.0" +schema-id: "https://markitect.dev/schemas/manpage/v1" +document-type: manpage +status: stable +--- + +# Manpage Schema v1.0 + +## Overview + +This schema validates Unix/Linux manual page documentation following +standard conventions (SYNOPSIS, DESCRIPTION, OPTIONS, etc.). + +## Document Types + +- Manual pages (man pages) +- CLI command documentation +- API reference pages + +## Usage + +\`\`\`bash +markitect validate mycommand.1.md --schema manpage-schema-v1 +\`\`\` + +## Examples + +See [examples/manpages/](../../examples/manpages/) for complete examples. + +## Schema Definition + +\`\`\`json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://markitect.dev/schemas/manpage/v1", + "version": "1.0.0", + "title": "Unix Manual Page Schema", + "type": "object", + "properties": { + "headings": { + "type": "object", + "properties": { + "level_1": { ... } + } + } + }, + "x-markitect-sections": { ... } +} +\`\`\` + +## Validation Rules + +### Required Sections +- **NAME** - Command name and brief description +- **SYNOPSIS** - Command syntax +- **DESCRIPTION** - Detailed description + +### Optional Sections +- **OPTIONS** - Command-line options +- **EXAMPLES** - Usage examples +- **SEE ALSO** - Related commands + +## Version History + +### v1.0.0 (2026-01-04) +- Initial release +- Basic manpage structure validation +``` + +**Implementation:** +```python +class MarkdownSchemaLoader: + """Load schemas from markdown files with embedded JSON.""" + + def load_schema_from_markdown(self, md_path: Path) -> dict: + """Extract JSON schema from markdown file.""" + content = md_path.read_text() + + # Parse frontmatter + frontmatter = self._extract_frontmatter(content) + + # Extract JSON from code block + schema_json = self._extract_json_from_code_block(content) + + # Merge metadata + schema = json.loads(schema_json) + schema['x-markitect-metadata'] = frontmatter + + return schema + + def save_schema_to_markdown(self, schema: dict, md_path: Path): + """Save schema as markdown with embedded JSON.""" + # Generate markdown documentation + doc = self._generate_schema_documentation(schema) + + # Embed JSON schema + json_block = f"```json\n{json.dumps(schema, indent=2)}\n```" + + # Combine + full_content = f"{doc}\n\n## Schema Definition\n\n{json_block}" + md_path.write_text(full_content) +``` + +**Benefits:** +- ✅ Markdown-first (aligns with MarkiTect philosophy) +- ✅ Rich documentation alongside schema +- ✅ Human-readable and editable +- ✅ Version history in same file +- ✅ Examples and usage inline +- ✅ Can extract JSON when needed + +**Drawbacks:** +- ⚠️ Requires parsing logic +- ⚠️ Two sources of truth (markdown + embedded JSON) +- ⚠️ More complex than pure JSON + +#### Option 2: Markdown Documentation Generator + +**Keep JSON schemas, auto-generate markdown docs:** + +``` +schemas/ + manpage-schema-v1.json # Source of truth + manpage-schema-v1.md # Auto-generated docs +``` + +**Command:** +```bash +markitect schema-document manpage-schema-v1.json +# Generates: manpage-schema-v1.md +``` + +**Benefits:** +- ✅ Simple implementation +- ✅ JSON remains source of truth +- ✅ Auto-generated docs always in sync + +**Drawbacks:** +- ⚠️ Two files to manage +- ⚠️ Can't hand-edit documentation (gets overwritten) + +#### Option 3: Markdown Schema Language (DSL) + +**Define schemas in markdown-native syntax:** + +```markdown +# Manpage Schema v1.0 + +## Document Structure + +### Required Sections (Level 1 Heading) + +**NAME** +- Classification: required +- Content: Command name in bold, followed by description +- Pattern: `**command** - description` + +**SYNOPSIS** +- Classification: required +- Content: Command syntax with options +- Min paragraphs: 1 +- Max paragraphs: 3 + +### Optional Sections + +**OPTIONS** +- Classification: recommended +- Content: Definition list of command-line options +``` + +**Parser generates JSON schema from markdown:** + +```bash +markitect schema-compile manpage-schema-v1.md --output manpage-schema-v1.json +``` + +**Benefits:** +- ✅ Pure markdown +- ✅ Human-friendly syntax +- ✅ No JSON editing needed + +**Drawbacks:** +- ⚠️ Complex parser implementation +- ⚠️ Limited to MarkiTect-specific features +- ⚠️ Can't use standard JSON Schema tools + +#### Option 4: Literate Schema Programming + +**Inspired by literate programming, mix documentation and schema:** + +```markdown +# Manpage Schema v1.0 + +Manual pages follow a standard structure. The NAME section is required: + +<>= +{ + "NAME": { + "classification": "required", + "heading_level": 2, + "content_instruction": "Command name and brief description" + } +} + +The SYNOPSIS section shows command syntax: + +<>= +{ + "SYNOPSIS": { + "classification": "required", + "heading_level": 2, + "min_code_blocks": 1 + } +} + +Complete schema: + +<>= +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "x-markitect-sections": { + <>, + <> + } +} +``` + +**Benefits:** +- ✅ Documentation and schema interleaved +- ✅ Literate programming benefits +- ✅ Reusable schema fragments + +**Drawbacks:** +- ⚠️ Complex tangling/weaving +- ⚠️ Unfamiliar paradigm +- ⚠️ Overkill for simple schemas + +## Recommendations + +### Short-Term (Immediate) + +1. **Naming Convention:** + - Format: `{domain}-schema-v{major}.{minor}.json` + - Example: `manpage-schema-v1.0.json` + +2. **Schema Metadata:** + - Add required `version`, `title`, `description` fields + - Add `x-markitect-*` metadata extensions + - Document in schema-catalog.yaml + +3. **Duplicate Cleanup:** + - Consolidate 3 manpage schemas into versioned series + - Keep enhanced-manpage as v2.0 (breaking changes) + - Archive old schemas + +### Medium-Term (Next Phase) + +4. **Markdown Schema Format (Option 1):** + - Implement markdown-first schema format + - Markdown file with embedded JSON in code block + - Parser extracts JSON for validation + - Rich documentation alongside schema + +5. **Schema Documentation Generator:** + - Auto-generate markdown docs from JSON schemas + - Include examples, usage, version history + - Link to example documents + +### Long-Term (Future) + +6. **Schema DSL (Option 3):** + - Evaluate markdown schema language + - Prototype parser for common patterns + - Consider if DSL adds value over JSON + +7. **Schema Registry API:** + - REST API for schema discovery + - Version negotiation + - Schema evolution tracking + +## Implementation Plan + +### Phase 1: Naming & Versioning (1-2 days) + +**Tasks:** +1. Define naming convention spec +2. Create schema metadata template +3. Rename existing schemas +4. Update schema-catalog.yaml +5. Update documentation + +**Deliverables:** +- Schema naming convention spec +- Migrated schemas with versions +- Updated catalog + +### Phase 2: Markdown Schema Format (3-5 days) + +**Tasks:** +1. Design markdown schema format +2. Implement parser (extract JSON from markdown) +3. Implement generator (create markdown from JSON) +4. Convert existing schemas to markdown format +5. Update CLI to support .md schemas +6. Write documentation and examples + +**Deliverables:** +- Markdown schema parser/generator +- All schemas in markdown format +- Updated CLI commands +- Migration guide + +### Phase 3: Schema Validation (2-3 days) + +**Tasks:** +1. Create metaschema for validating schemas +2. Add schema validation command +3. Validate all existing schemas +4. Add CI check for schema validity + +**Deliverables:** +- Schema-for-schemas (metaschema) +- Validation command +- CI integration + +## Cost-Benefit Analysis + +### Option 1: Markdown-First (Recommended) + +**Cost:** +- Parser implementation: ~200 lines +- CLI updates: ~100 lines +- Migration effort: 2-3 days +- Testing: 1 day + +**Benefit:** +- Aligned with markdown philosophy ⭐⭐⭐⭐⭐ +- Rich documentation ⭐⭐⭐⭐⭐ +- Version history inline ⭐⭐⭐⭐ +- Human-friendly ⭐⭐⭐⭐⭐ +- Lower barrier to entry ⭐⭐⭐⭐ + +**Total:** High value for reasonable cost + +### Option 2: Documentation Generator + +**Cost:** +- Generator implementation: ~150 lines +- Template design: 1 day +- Testing: 0.5 days + +**Benefit:** +- Simple implementation ⭐⭐⭐⭐ +- Auto-sync docs ⭐⭐⭐⭐ +- JSON remains source ⭐⭐⭐ + +**Total:** Good value, lower cost + +### Option 3: Schema DSL + +**Cost:** +- DSL design: 2-3 days +- Parser implementation: ~500 lines +- Compiler: ~300 lines +- Testing: 2 days +- Documentation: 1 day + +**Benefit:** +- Pure markdown ⭐⭐⭐⭐⭐ +- No JSON editing ⭐⭐⭐⭐ +- Limited ecosystem ⭐⭐ + +**Total:** High cost, uncertain value + +## Decision Matrix + +| Criterion | Option 1: Markdown-First | Option 2: Doc Generator | Option 3: DSL | +|-----------|-------------------------|------------------------|---------------| +| Markdown alignment | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | +| Implementation cost | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | +| Documentation quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | +| Tool ecosystem | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | +| Maintainability | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | +| User-friendliness | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | + +## Recommendation Summary + +1. **Immediate:** Implement naming convention and versioning +2. **Short-term:** Choose **Option 1 (Markdown-First)** for schema format +3. **Fallback:** If Option 1 proves too complex, use **Option 2 (Doc Generator)** +4. **Future:** Evaluate DSL if community demand emerges + +## Next Steps + +1. Review and approve this proposal +2. Create naming convention specification +3. Prototype markdown schema parser +4. Migrate one schema as proof-of-concept +5. Gather feedback and iterate +6. Full migration of all schemas + +--- + +## Appendix: Example Markdown Schema + +See `examples/schemas/manpage-schema-v1.md` for a complete example of the proposed format. diff --git a/roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_SUMMARY.md b/roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_SUMMARY.md new file mode 100644 index 00000000..1d9b8990 --- /dev/null +++ b/roadmap/schema-of-schemas/SCHEMA_MANAGEMENT_SUMMARY.md @@ -0,0 +1,154 @@ +# Schema Management: Executive Summary + +**TL;DR:** Implement naming conventions, versioning, and markdown-first schema format to solve current schema management issues. + +## Problems Identified + +1. **Inconsistent Naming** - Mix of `schema.json` suffix and no suffix +2. **No Versioning** - Can't track schema evolution or maintain multiple versions +3. **Duplicate Schemas** - 3 manpage schemas with similar content +4. **Format Mismatch** - JSON schemas in markdown-centric tool + +## Recommended Solution + +### 1. Naming Convention (Immediate) + +**Format:** `{domain}-schema-v{major}.{minor}.json` or `.md` + +**Examples:** +``` +manpage-schema-v1.0.json +terminology-schema-v1.0.json +api-documentation-schema-v1.0.json +``` + +**Migration:** +``` +markdown-manpage → manpage-schema-v1.0.json +enhanced-manpage → manpage-schema-v2.0.json (breaking changes) +terminology-schema.json → terminology-schema-v1.0.json +``` + +### 2. Markdown-First Format (Short-term) + +**Proposal:** Store schemas as markdown files with embedded JSON + +**Benefits:** +- Aligns with markdown philosophy ✅ +- Rich documentation alongside schema ✅ +- Version history in same file ✅ +- Examples and usage inline ✅ +- Lower barrier to entry ✅ + +**Example:** See `examples/schemas/manpage-schema-v1.md` + +**Format:** +```markdown +# Schema Title v1.0 + +## Documentation sections... + +## Schema Definition + +\`\`\`json +{ schema here } +\`\`\` +``` + +### 3. Schema Metadata Standard (Immediate) + +**Required fields:** +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://markitect.dev/schemas/{domain}/v{major}", + "version": "1.0.0", + "title": "Human Readable Title", + "description": "Detailed description", + "x-markitect-metadata": { + "domain": "manpage", + "document-types": ["manual-page"], + "created": "2026-01-04", + "example": "examples/manpages/example.md" + } +} +``` + +## Implementation Phases + +### Phase 1: Foundation (1-2 days) +- [x] Analyze current state +- [ ] Define naming convention spec +- [ ] Create schema metadata template +- [ ] Rename existing schemas +- [ ] Update schema-catalog.yaml + +### Phase 2: Markdown Format (3-5 days) +- [ ] Design markdown schema format +- [ ] Implement parser (extract JSON from markdown) +- [ ] Convert 1 schema as proof-of-concept +- [ ] Test and iterate +- [ ] Migrate all schemas + +### Phase 3: Tooling (2-3 days) +- [ ] Update CLI to support .md schemas +- [ ] Add schema validation command +- [ ] Create migration guide +- [ ] Update documentation + +## Cost-Benefit Analysis + +**Cost:** 6-10 days total effort + +**Benefits:** +- Professional schema management ⭐⭐⭐⭐⭐ +- Better discoverability ⭐⭐⭐⭐ +- Easier maintenance ⭐⭐⭐⭐⭐ +- Markdown alignment ⭐⭐⭐⭐⭐ +- Version tracking ⭐⭐⭐⭐⭐ + +**ROI:** High - Foundational improvement that benefits all future schema work + +## Alternative Considered + +**Alternative:** Keep JSON, generate markdown docs automatically + +**Pros:** +- Simpler implementation (2-3 days) +- JSON remains source of truth +- Standard tooling works + +**Cons:** +- Doesn't solve format mismatch +- Documentation generated, not authored +- Two files to manage + +**Verdict:** Markdown-first better aligns with project philosophy + +## Quick Wins (Today) + +1. **Rename schemas** with versioned names (30 minutes) +2. **Add metadata** to existing schemas (1 hour) +3. **Update catalog** with proper versioning (30 minutes) + +## Questions to Resolve + +1. **File extension:** `.md` or `.schema.md` for markdown schemas? +2. **JSON extraction:** Real-time or pre-compiled cache? +3. **Backward compatibility:** Support both formats during transition? +4. **CLI changes:** `--schema file.md` or auto-detect format? + +## Next Steps + +1. **Review** this proposal and example (examples/schemas/manpage-schema-v1.md) +2. **Decide** on markdown-first vs generated docs approach +3. **Prototype** parser for markdown schemas +4. **Migrate** one schema as proof-of-concept +5. **Iterate** based on feedback +6. **Full rollout** to all schemas + +## References + +- Full proposal: [SCHEMA_MANAGEMENT_PROPOSAL.md](./SCHEMA_MANAGEMENT_PROPOSAL.md) +- Example markdown schema: [examples/schemas/manpage-schema-v1.md](../../examples/schemas/manpage-schema-v1.md) +- Current schema catalog: [markitect/schemas/schema-catalog.yaml](../../markitect/schemas/schema-catalog.yaml) diff --git a/roadmap/schema-of-schemas/WORKPLAN.md b/roadmap/schema-of-schemas/WORKPLAN.md new file mode 100644 index 00000000..65fec4bf --- /dev/null +++ b/roadmap/schema-of-schemas/WORKPLAN.md @@ -0,0 +1,962 @@ +# Schema-of-Schemas Implementation Workplan + +**Project:** Implement Markdown-First Schema System with Self-Description +**Created:** 2026-01-04 +**Status:** Planning +**Duration:** 6-10 days +**Priority:** High - Foundation for all schema work + +## Executive Summary + +This workplan implements a comprehensive schema management system: +1. Filename conventions and versioning +2. Markdown-first schema format (`.md` with embedded JSON) +3. Schema-for-schemas (metaschema) for validation +4. Migration of existing schemas +5. Cleanup of legacy schemas from registry + +## Project Goals + +### Primary Goals +- [x] Establish filename convention: `{domain}-schema-v{version}.md` +- [ ] Implement markdown schema parser (extract JSON from markdown) +- [ ] Create schema-for-schemas to validate all schemas +- [ ] Migrate existing schemas to new format +- [ ] Remove legacy/duplicate schemas from registry + +### Success Criteria +- ✅ All schemas follow naming convention +- ✅ Schemas stored as markdown files with embedded JSON +- ✅ Schema-for-schemas validates all schemas successfully +- ✅ No duplicate schemas in registry +- ✅ CLI commands work with `.md` schema files +- ✅ Documentation updated + +## Architecture Overview + +### Current State +``` +Schemas: JSON files (.json) +Naming: Inconsistent (api-documentation, markdown-manpage-schema.json) +Versioning: None +Documentation: Separate or missing +Registry: Database with 5 schemas (3 duplicates) +``` + +### Target State +``` +Schemas: Markdown files (.md) with embedded JSON +Naming: {domain}-schema-v{major}.{minor}.md +Versioning: SemVer in filename and metadata +Documentation: Inline with schema +Registry: Clean, versioned, no duplicates +Validation: Schema-for-schemas validates all schemas +``` + +### Components to Build + +``` +markitect/ +├── schema_loader.py # NEW: Load schemas from markdown +├── schema_validator.py # UPDATED: Support .md schemas +├── cli.py # UPDATED: Accept .md schema files +└── schemas/ + ├── schema-schema-v1.md # NEW: Schema-for-schemas + └── ...versioned schemas... + +examples/schemas/ # Markdown schema examples +└── manpage-schema-v1.md # Already created + +roadmap/schema-of-schemas/ # Planning artifacts +├── WORKPLAN.md # This file +├── SCHEMA_NAMING_SPEC.md # Naming convention spec +└── IMPLEMENTATION_LOG.md # Progress tracking +``` + +## Phase Breakdown + +### Phase 0: Planning & Setup ✅ (0.5 days) + +**Goal:** Establish project structure and specifications + +**Tasks:** +- [x] Create roadmap/schema-of-schemas directory +- [x] Move planning documents to roadmap +- [ ] Write naming convention specification +- [ ] Document schema metadata standard +- [ ] Create implementation checklist + +**Deliverables:** +- [x] Directory structure +- [ ] SCHEMA_NAMING_SPEC.md +- [ ] SCHEMA_METADATA_SPEC.md +- [ ] This workplan + +**Duration:** 0.5 days +**Status:** In Progress + +--- + +### Phase 1: Filename Convention & Validation (1 day) + +**Goal:** Establish and enforce filename conventions + +**1.1 Define Naming Convention** + +**Specification:** +``` +Format: {domain}-schema-v{major}.{minor}.md + +Components: +- domain: lowercase, hyphen-separated (e.g., "manpage", "api-documentation") +- schema: literal string "schema" +- version: SemVer major.minor (e.g., "v1.0", "v2.1") +- extension: ".md" (markdown) + +Examples: +✓ manpage-schema-v1.0.md +✓ terminology-schema-v1.0.md +✓ api-documentation-schema-v1.0.md +✗ manpage.json (missing version) +✗ manpage-v1.md (missing "schema") +✗ ManPage-Schema-v1.0.md (wrong case) +``` + +**1.2 Implement Validation Function** + +**File:** `markitect/schema_naming.py` (NEW) + +```python +import re +from pathlib import Path +from typing import Tuple, Optional + +SCHEMA_FILENAME_PATTERN = re.compile( + r'^(?P[a-z][a-z0-9-]*)-schema-v(?P\d+)\.(?P\d+)\.md$' +) + +def validate_schema_filename(filename: str) -> Tuple[bool, Optional[dict]]: + """ + Validate schema filename against convention. + + Returns: + (is_valid, metadata_dict) + """ + match = SCHEMA_FILENAME_PATTERN.match(filename) + if not match: + return False, None + + return True, { + 'domain': match.group('domain'), + 'version': f"{match.group('major')}.{match.group('minor')}", + 'major': int(match.group('major')), + 'minor': int(match.group('minor')) + } + +def suggest_schema_filename(domain: str, version: str) -> str: + """Generate correct schema filename from domain and version.""" + # Normalize domain: lowercase, replace spaces with hyphens + domain_clean = domain.lower().replace(' ', '-').replace('_', '-') + return f"{domain_clean}-schema-v{version}.md" +``` + +**1.3 Add CLI Validation** + +**Update:** `markitect/cli.py` - schema-ingest command + +```python +@cli.command('schema-ingest') +@click.argument('schema_file', type=click.Path(exists=True, path_type=Path)) +@click.option('--force', is_flag=True, help='Skip filename validation') +def schema_ingest(config, schema_file, force): + """Ingest schema file with filename validation.""" + from .schema_naming import validate_schema_filename, suggest_schema_filename + + filename = schema_file.name + is_valid, metadata = validate_schema_filename(filename) + + if not is_valid and not force: + click.echo(f"❌ Invalid schema filename: {filename}", err=True) + click.echo("\nExpected format: {domain}-schema-v{major}.{minor}.md") + click.echo("Example: manpage-schema-v1.0.md") + + # Try to suggest correct name + # ... extract domain/version from file content ... + suggestion = suggest_schema_filename(domain, version) + click.echo(f"\nSuggested filename: {suggestion}") + click.echo("\nUse --force to skip validation") + sys.exit(1) + + # Continue with ingestion... +``` + +**Tasks:** +- [ ] Write `markitect/schema_naming.py` +- [ ] Add unit tests for filename validation +- [ ] Update `schema-ingest` command with validation +- [ ] Test with valid and invalid filenames + +**Deliverables:** +- [ ] schema_naming.py with validation logic +- [ ] Unit tests (tests/test_schema_naming.py) +- [ ] Updated CLI with validation +- [ ] SCHEMA_NAMING_SPEC.md documentation + +**Duration:** 1 day + +--- + +### Phase 2: Markdown Schema Loader (2-3 days) + +**Goal:** Parse markdown files to extract JSON schemas + +**2.1 Design Markdown Schema Format** + +**Format Specification:** + +```markdown +--- +schema-id: "https://markitect.dev/schemas/{domain}/v{major}" +version: "{major}.{minor}.{patch}" +status: "stable|draft|deprecated" +--- + +# {Title} v{version} + +## Overview +[Human-readable description] + +## Usage +[Examples of how to use this schema] + +## Schema Definition + +```json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://markitect.dev/schemas/{domain}/v{major}", + "version": "{major}.{minor}.{patch}", + ... +} +\``` + +## Validation Rules +[Explanation of schema rules] + +## Version History +[Changelog] +``` + +**2.2 Implement Markdown Schema Loader** + +**File:** `markitect/schema_loader.py` (NEW) + +```python +""" +Schema Loader - Extract JSON schemas from markdown files. + +Supports: +- YAML frontmatter for metadata +- JSON code block for schema definition +- Validation of schema structure +""" + +import re +import json +import yaml +from pathlib import Path +from typing import Dict, Any, Optional, Tuple + +class MarkdownSchemaLoader: + """Load and parse markdown schema files.""" + + def __init__(self): + self.frontmatter_pattern = re.compile( + r'^---\s*\n(.*?)\n---\s*\n', + re.DOTALL | re.MULTILINE + ) + self.json_code_block_pattern = re.compile( + r'```json\s*\n(.*?)\n```', + re.DOTALL | re.MULTILINE + ) + + def load_schema(self, md_path: Path) -> Dict[str, Any]: + """ + Load schema from markdown file. + + Returns: + { + 'schema': {...}, # Extracted JSON schema + 'metadata': {...}, # Frontmatter metadata + 'documentation': '...' # Full markdown content + } + """ + if not md_path.exists(): + raise FileNotFoundError(f"Schema file not found: {md_path}") + + content = md_path.read_text(encoding='utf-8') + + # Extract frontmatter + metadata = self._extract_frontmatter(content) + + # Extract JSON schema + schema = self._extract_json_schema(content) + + if not schema: + raise ValueError(f"No JSON schema found in {md_path}") + + # Merge metadata into schema + schema = self._merge_metadata(schema, metadata, md_path) + + return { + 'schema': schema, + 'metadata': metadata, + 'documentation': content, + 'source_file': str(md_path) + } + + def _extract_frontmatter(self, content: str) -> Dict[str, Any]: + """Extract YAML frontmatter from markdown.""" + match = self.frontmatter_pattern.search(content) + if not match: + return {} + + try: + return yaml.safe_load(match.group(1)) or {} + except yaml.YAMLError as e: + raise ValueError(f"Invalid YAML frontmatter: {e}") + + def _extract_json_schema(self, content: str) -> Optional[Dict[str, Any]]: + """Extract JSON schema from code block.""" + matches = self.json_code_block_pattern.findall(content) + + if not matches: + return None + + # Use the first JSON code block as schema + # (or could look for specific heading like "## Schema Definition") + try: + return json.loads(matches[0]) + except json.JSONDecodeError as e: + raise ValueError(f"Invalid JSON schema: {e}") + + def _merge_metadata( + self, + schema: Dict[str, Any], + metadata: Dict[str, Any], + source_file: Path + ) -> Dict[str, Any]: + """Merge frontmatter metadata into schema.""" + + # Add MarkiTect-specific metadata + schema['x-markitect-source'] = { + 'file': str(source_file), + 'format': 'markdown', + 'frontmatter': metadata + } + + # Override schema fields with frontmatter if present + if 'version' in metadata: + schema['version'] = metadata['version'] + if 'schema-id' in metadata: + schema['$id'] = metadata['schema-id'] + + return schema + + def save_schema( + self, + schema: Dict[str, Any], + md_path: Path, + template: Optional[str] = None + ): + """ + Save schema as markdown file. + + Args: + schema: JSON schema dict + md_path: Output path + template: Optional markdown template + """ + if template: + # Use provided template + content = self._render_template(template, schema) + else: + # Generate basic markdown + content = self._generate_markdown(schema) + + md_path.write_text(content, encoding='utf-8') + + def _generate_markdown(self, schema: Dict[str, Any]) -> str: + """Generate markdown from schema.""" + title = schema.get('title', 'Untitled Schema') + version = schema.get('version', '1.0.0') + description = schema.get('description', '') + + # Generate frontmatter + frontmatter = yaml.dump({ + 'schema-id': schema.get('$id', ''), + 'version': version, + 'status': 'draft' + }, default_flow_style=False) + + # Generate markdown + md = f"""--- +{frontmatter}--- + +# {title} v{version} + +## Overview + +{description} + +## Schema Definition + +```json +{json.dumps(schema, indent=2)} +``` + +## Version History + +### v{version} +- Initial version +""" + return md + + +class SchemaLoaderError(Exception): + """Base exception for schema loading errors.""" + pass +``` + +**2.3 Update Schema Validator** + +**Update:** `markitect/schema_validator.py` + +```python +from .schema_loader import MarkdownSchemaLoader + +class SchemaValidator: + def __init__(self): + self.schema_generator = SchemaGenerator() + self.jsonschema_available = JSONSCHEMA_AVAILABLE + self.md_loader = MarkdownSchemaLoader() # NEW + + def validate_file_against_schema_file( + self, + file_path: Path, + schema_file_path: Path + ) -> bool: + """Validate file against schema (supports .json and .md).""" + + # Detect schema file format + if schema_file_path.suffix == '.md': + # Load from markdown + schema_data = self.md_loader.load_schema(schema_file_path) + schema = schema_data['schema'] + else: + # Load from JSON (legacy) + schema_content = schema_file_path.read_text(encoding='utf-8') + schema = json.loads(schema_content) + + return self.validate_file_against_schema(file_path, schema) +``` + +**Tasks:** +- [ ] Implement MarkdownSchemaLoader class +- [ ] Add frontmatter extraction (YAML) +- [ ] Add JSON code block extraction +- [ ] Add metadata merging logic +- [ ] Write comprehensive unit tests +- [ ] Update SchemaValidator to use loader +- [ ] Test with example markdown schemas + +**Deliverables:** +- [ ] schema_loader.py implementation +- [ ] Unit tests (tests/test_schema_loader.py) +- [ ] Updated schema_validator.py +- [ ] Integration tests + +**Duration:** 2-3 days + +--- + +### Phase 3: Schema-for-Schemas (2 days) + +**Goal:** Create metaschema to validate all schema files + +**3.1 Design Schema-for-Schemas** + +**File:** `markitect/schemas/schema-schema-v1.md` + +**Purpose:** Validates that schema files follow MarkiTect conventions + +**Validates:** +- Required fields ($schema, $id, version, title, description) +- Version format (SemVer) +- $id URL format +- x-markitect-* extensions +- Section classifications +- Content control structures + +**3.2 Implement Schema-for-Schemas** + +See separate file: `roadmap/schema-of-schemas/schema-schema-v1.md` (to be created) + +**3.3 Add Schema Validation Command** + +**New CLI command:** `markitect schema-validate` + +```python +@cli.command('schema-validate') +@click.argument('schema_file', type=click.Path(exists=True, path_type=Path)) +@click.option('--detailed-errors', is_flag=True) +def schema_validate(config, schema_file, detailed_errors): + """ + Validate a schema file against the schema-for-schemas. + + Ensures schema files follow MarkiTect conventions and standards. + """ + from .schema_loader import MarkdownSchemaLoader + from .schema_validator import SchemaValidator + + loader = MarkdownSchemaLoader() + validator = SchemaValidator() + + # Load the schema + try: + schema_data = loader.load_schema(schema_file) + schema = schema_data['schema'] + except Exception as e: + click.echo(f"❌ Failed to load schema: {e}", err=True) + sys.exit(1) + + # Load schema-for-schemas + metaschema_path = Path(__file__).parent / 'schemas' / 'schema-schema-v1.md' + metaschema_data = loader.load_schema(metaschema_path) + metaschema = metaschema_data['schema'] + + # Validate + is_valid = validator.validate_schema_against_metaschema(schema, metaschema) + + if is_valid: + click.echo(f"✅ Schema is valid: {schema_file.name}") + click.echo(f" Title: {schema.get('title')}") + click.echo(f" Version: {schema.get('version')}") + else: + click.echo(f"❌ Schema validation failed: {schema_file.name}", err=True) + if detailed_errors: + # Show detailed errors + pass + sys.exit(1) +``` + +**Tasks:** +- [ ] Design schema-for-schemas structure +- [ ] Implement schema-schema-v1.md +- [ ] Add schema validation logic +- [ ] Create `schema-validate` CLI command +- [ ] Test all existing schemas against metaschema +- [ ] Document validation rules + +**Deliverables:** +- [ ] schema-schema-v1.md (metaschema) +- [ ] schema-validate command +- [ ] Validation documentation +- [ ] Test suite + +**Duration:** 2 days + +--- + +### Phase 4: Schema Migration (1-2 days) + +**Goal:** Convert existing schemas to new format + +**4.1 Inventory Current Schemas** + +Current schemas in database: +``` +1. terminology-schema.json → terminology-schema-v1.0.md +2. api-documentation → api-documentation-schema-v1.0.md +3. enhanced-manpage → manpage-schema-v2.0.md +4. markdown-manpage → manpage-schema-v1.0.md (DUPLICATE) +5. markdown-manpage-schema.json → manpage-schema-v1.0.md (DUPLICATE) +``` + +**Decision matrix:** +- Keep enhanced-manpage as v2.0 (has classifications) +- Merge markdown-manpage variants into v1.0 +- Update terminology to v1.0 +- Update api-documentation to v1.0 + +**4.2 Create Migration Script** + +**File:** `scripts/migrate_schemas.py` + +```python +#!/usr/bin/env python3 +"""Migrate schemas to markdown format with versioning.""" + +from pathlib import Path +from markitect.schema_loader import MarkdownSchemaLoader +from markitect.database import DatabaseManager + +def migrate_schema( + db_manager: DatabaseManager, + old_name: str, + new_name: str, + version: str, + domain: str +): + """Migrate single schema to new format.""" + + # Get old schema from database + old_schema = db_manager.get_schema_file(old_name) + if not old_schema: + print(f"❌ Schema not found: {old_name}") + return + + schema_json = json.loads(old_schema['schema_content']) + + # Update metadata + schema_json['version'] = version + schema_json['$id'] = f"https://markitect.dev/schemas/{domain}/v{version.split('.')[0]}" + + # Save as markdown + loader = MarkdownSchemaLoader() + md_path = Path(f"markitect/schemas/{new_name}") + loader.save_schema(schema_json, md_path) + + print(f"✓ Migrated: {old_name} → {new_name}") + + # Ingest new schema + # ... ingest markdown schema to database ... + + return md_path + +def main(): + migrations = [ + ('terminology-schema.json', 'terminology-schema-v1.0.md', '1.0.0', 'terminology'), + ('api-documentation', 'api-documentation-schema-v1.0.md', '1.0.0', 'api-documentation'), + ('enhanced-manpage', 'manpage-schema-v2.0.md', '2.0.0', 'manpage'), + ('markdown-manpage', 'manpage-schema-v1.0.md', '1.0.0', 'manpage'), + ] + + db = DatabaseManager('markitect.db') + + for old, new, version, domain in migrations: + migrate_schema(db, old, new, version, domain) +``` + +**4.3 Execute Migration** + +```bash +# Run migration script +python scripts/migrate_schemas.py + +# Validate all new schemas +for schema in markitect/schemas/*-schema-v*.md; do + markitect schema-validate "$schema" +done + +# Ingest new schemas +for schema in markitect/schemas/*-schema-v*.md; do + markitect schema-ingest "$schema" +done +``` + +**4.4 Clean Up Registry** + +```bash +# Remove old schemas from database +markitect schema-delete markdown-manpage +markitect schema-delete markdown-manpage-schema.json +markitect schema-delete api-documentation +markitect schema-delete enhanced-manpage +markitect schema-delete terminology-schema.json + +# Verify cleanup +markitect schema-list +# Should show only versioned .md schemas +``` + +**Tasks:** +- [ ] Create schema inventory +- [ ] Write migration script +- [ ] Test migration on one schema +- [ ] Execute full migration +- [ ] Validate all migrated schemas +- [ ] Remove old schemas from database +- [ ] Update examples to use new schema names + +**Deliverables:** +- [ ] scripts/migrate_schemas.py +- [ ] All schemas migrated to markdown +- [ ] Clean registry (no duplicates) +- [ ] Migration report + +**Duration:** 1-2 days + +--- + +### Phase 5: CLI & Documentation Updates (1 day) + +**Goal:** Update CLI and documentation for new system + +**5.1 Update CLI Commands** + +Commands to update: +- `schema-ingest` - Accept .md files, validate filename +- `schema-list` - Show version in output +- `schema-get` - Export as .md or .json +- `validate` - Accept .md schema files +- `generate-stub` - Work with .md schemas +- `schema-generate` - Output .md format option +- NEW: `schema-validate` - Validate against metaschema + +**5.2 Update Documentation** + +Files to update: +- README.md - Mention markdown schemas +- examples/terminology/README.md - Use new schema name +- docs/specifications/schema-extensions-spec.md - Document markdown format +- Create: docs/guides/schema-authoring-guide.md + +**5.3 Add Schema Templates** + +**File:** `templates/schema-template-v1.md` + +```markdown +--- +schema-id: "https://markitect.dev/schemas/DOMAIN/v1" +version: "1.0.0" +status: "draft" +--- + +# TITLE Schema v1.0 + +## Overview + +[Description of what this schema validates] + +## Document Types + +- [Document type 1] +- [Document type 2] + +## Usage + +\`\`\`bash +markitect validate document.md --schema DOMAIN-schema-v1.0.md +\`\`\` + +## Examples + +See [examples/DOMAIN/example.md](../../examples/DOMAIN/example.md) + +## Schema Definition + +\`\`\`json +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://markitect.dev/schemas/DOMAIN/v1", + "version": "1.0.0", + "title": "TITLE Schema", + "description": "Schema for validating DESCRIPTION", + "type": "object", + "properties": { + "headings": { + "type": "object" + } + }, + "x-markitect-sections": {}, + "x-markitect-content-control": {} +} +\`\`\` + +## Validation Rules + +### Required Sections +- **SECTION** - Description + +### Optional Sections +- **SECTION** - Description + +## Version History + +### v1.0.0 (YYYY-MM-DD) +- Initial release +``` + +**Tasks:** +- [ ] Update all CLI commands for .md support +- [ ] Update documentation +- [ ] Create schema authoring guide +- [ ] Add schema template +- [ ] Update examples +- [ ] Test all workflows end-to-end + +**Deliverables:** +- [ ] Updated CLI commands +- [ ] Schema authoring guide +- [ ] Schema template +- [ ] Updated examples +- [ ] End-to-end tests + +**Duration:** 1 day + +--- + +### Phase 6: Testing & Validation (1 day) + +**Goal:** Comprehensive testing of new system + +**6.1 Unit Tests** + +Test coverage for: +- `schema_naming.py` - Filename validation +- `schema_loader.py` - Markdown parsing +- `schema_validator.py` - Validation with .md schemas + +**6.2 Integration Tests** + +End-to-end workflows: +1. Create new schema in markdown format +2. Validate schema against schema-for-schemas +3. Ingest schema to database +4. Use schema to validate documents +5. Generate stub from schema +6. Export schema + +**6.3 Regression Tests** + +Ensure existing functionality still works: +- JSON schemas still load (backward compatibility) +- All existing documents validate +- Schema generation still works +- Stub generation still works + +**Tasks:** +- [ ] Write unit tests for new modules +- [ ] Create integration test suite +- [ ] Run regression tests +- [ ] Fix any issues found +- [ ] Achieve >80% code coverage +- [ ] Document test procedures + +**Deliverables:** +- [ ] Unit tests (>80% coverage) +- [ ] Integration tests +- [ ] Regression test suite +- [ ] Test documentation + +**Duration:** 1 day + +--- + +## Timeline + +``` +Week 1: + Day 1: Phase 0 (Planning) + Phase 1 (Naming Convention) + Day 2-3: Phase 2 (Markdown Loader) + Day 4-5: Phase 3 (Schema-for-Schemas) + +Week 2: + Day 6-7: Phase 4 (Migration) + Day 8: Phase 5 (CLI & Docs) + Day 9: Phase 6 (Testing) + Day 10: Buffer for issues/refinement +``` + +**Total:** 8-10 days + +## Risks & Mitigation + +### Risk 1: Parsing Complexity +**Risk:** Markdown parsing more complex than expected +**Probability:** Medium +**Impact:** High +**Mitigation:** +- Start with simple regex-based parser +- Test extensively with edge cases +- Have fallback to simpler format + +### Risk 2: Backward Compatibility +**Risk:** Breaking existing workflows +**Probability:** Low +**Impact:** High +**Mitigation:** +- Support both .json and .md during transition +- Provide migration script +- Test thoroughly with existing documents + +### Risk 3: Schema-for-Schemas Complexity +**Risk:** Self-referential validation complex +**Probability:** Medium +**Impact:** Medium +**Mitigation:** +- Start with simple metaschema +- Iterate based on actual schemas +- Don't over-engineer initially + +## Success Metrics + +- [ ] All schemas follow naming convention (5/5) +- [ ] All schemas in markdown format (5/5) +- [ ] All schemas validate against metaschema (5/5) +- [ ] Zero duplicate schemas in registry +- [ ] CLI commands work with .md schemas +- [ ] Documentation comprehensive +- [ ] Test coverage >80% +- [ ] No regression in existing functionality + +## Deliverables Checklist + +### Code +- [ ] markitect/schema_naming.py +- [ ] markitect/schema_loader.py +- [ ] markitect/schemas/schema-schema-v1.md +- [ ] scripts/migrate_schemas.py +- [ ] Updated CLI commands +- [ ] Unit tests +- [ ] Integration tests + +### Documentation +- [ ] SCHEMA_NAMING_SPEC.md +- [ ] SCHEMA_METADATA_SPEC.md +- [ ] Schema authoring guide +- [ ] Migration guide +- [ ] Updated examples +- [ ] IMPLEMENTATION_LOG.md + +### Schemas +- [ ] terminology-schema-v1.0.md +- [ ] api-documentation-schema-v1.0.md +- [ ] manpage-schema-v1.0.md +- [ ] manpage-schema-v2.0.md +- [ ] schema-schema-v1.0.md + +### Registry +- [ ] Clean schema database +- [ ] Updated schema-catalog.yaml +- [ ] No duplicates + +## Next Steps + +1. **Review this workplan** - Get approval +2. **Phase 0** - Complete planning artifacts +3. **Phase 1** - Implement naming validation +4. **Checkpoint** - Review progress after Phase 1 +5. **Continue** - Execute remaining phases + +## Approval + +- [ ] Workplan reviewed +- [ ] Approach approved +- [ ] Ready to begin implementation + +--- + +**Status:** Awaiting approval +**Next Action:** Complete Phase 0 planning artifacts