feat: add manpages example demonstrating schema validation

Add comprehensive example showcasing schema validation with self-documenting manpage system:

- markdown-manpage-schema.json: Reusable schema for Unix manpage structure
- markdown-schema-validation.1.md: Complete manual about schema validation
- README.md: Usage guide, integration examples, and best practices
- SCHEMA_EVOLUTION_WORKPLAN.md: Roadmap for enhanced schema system

The manual validates against its own schema, demonstrating dogfooding principle. Workplan outlines 5-phase evolution from rigid structural validation to flexible content control with blueprints.

Key features demonstrated:

- Schema-driven documentation structure
- Self-validating documentation
- Reusable validation patterns
- Classification system design (required/recommended/optional/discouraged/improper)

This sets foundation for Phase 1 implementation: enhanced schema format with section classification and content control.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
4
TODO.md
@@ -12,6 +12,8 @@ The structure organizes **future tasks** by their impact, just as a changelog or
|
||||
|
||||
This section is for tasks currently being discussed with or worked on by the coding assistant. These are the ephemeral, flow-of-thought tasks.
|
||||
|
||||
0. The file TODO.html appears to be legacy and can probably be removed.
|
||||
|
||||
### Extract Capability-Capability from Issue-Facade
|
||||
|
||||
**Context:** Issue-facade currently provides two capabilities:
|
||||
@@ -81,4 +83,4 @@ The **capability-capability** includes:
|
||||
- ✅ Refactored to family-based directory structure (_issue-tracking/issue-facade)
|
||||
- ✅ Made feedback directory visible (feedback/ not .feedback/)
|
||||
- ✅ Renamed to explicit family declaration (CAPABILITY-issue-tracking.yaml)
|
||||
- ✅ Created CHANGELOG.md documenting v1.0.0
|
||||
- ✅ Created CHANGELOG.md documenting v1.0.0
|
||||
|
||||
388
examples/manpages/README.md
Normal file
@@ -0,0 +1,388 @@
|
||||
# Unix Manpage Schema Validation Example
|
||||
|
||||
This example demonstrates MarkiTect's schema validation system by creating a self-validating documentation set: a schema that defines Unix manpage structure and a comprehensive manual about schema validation that validates against its own schema definition.
|
||||
|
||||
## Overview
|
||||
|
||||
This example showcases the "dogfooding" principle - using MarkiTect's schema validation to document schema validation itself. It demonstrates:
|
||||
|
||||
- **Schema-driven documentation** - Defining document structure with JSON Schema
|
||||
- **Self-validation** - The manual validates against the manpage schema it demonstrates
|
||||
- **Reusable patterns** - The manpage schema can validate any Unix-style manual page
|
||||
- **Complete workflow** - From schema creation through validation and refinement
|
||||
|
||||
## Files in This Example
|
||||
|
||||
### `markdown-manpage-schema.json`
|
||||
|
||||
A JSON Schema defining the structure of Unix-style manual pages written in Markdown.
|
||||
|
||||
**Key Features:**
|
||||
- Validates H1 title format: `command(section) - description`
|
||||
- Requires SYNOPSIS and DESCRIPTION sections
|
||||
- Validates heading hierarchy (H1, H2, H3, H4)
|
||||
- Ensures presence of code examples, paragraphs, and emphasis
|
||||
- Includes custom `x-markitect-*` extensions for manpage conventions
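
The title rule above can be sketched as a regular expression. This is only an illustration: the pattern, the `TITLE_RE` name, and the `parse_title` helper are assumptions, and the regex actually stored in `markdown-manpage-schema.json` may differ.

```python
import re

# Hypothetical pattern for "command(section) - description".
# The schema's real regex is not shown here and may be stricter or looser.
TITLE_RE = re.compile(
    r"^(?P<command>[A-Za-z0-9_.-]+)\((?P<section>[1-9][a-z]*)\) - (?P<description>\S.*)$"
)


def parse_title(title: str):
    """Return (command, section, description), or None if the title does not match."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return match.group("command"), match.group("section"), match.group("description")
```

A heading such as `markitect(1) - Markdown toolkit` would parse into its three parts, while a plain heading without the `(section)` suffix would be rejected.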
|
||||
|
||||
**Schema Requirements:**
|
||||
- Exactly 1 H1 heading (document title)
|
||||
- 3-30 H2 headings (major sections)
|
||||
- 0-50 H3 headings (subsections)
|
||||
- 5-500 paragraphs (content)
|
||||
- 1-50 code blocks (examples)
|
||||
- 10-500 emphasis elements (commands/arguments)
|
||||
|
||||
### `markdown-schema-validation.1.md`
|
||||
|
||||
A comprehensive manual page (section 7) documenting MarkiTect's markdown schema validation system.
|
||||
|
||||
**Sections Include:**
|
||||
- SYNOPSIS - Command syntax reference
|
||||
- DESCRIPTION - How schema validation works
|
||||
- SCHEMA STRUCTURE - JSON Schema format details
|
||||
- COMMANDS - Schema management and validation commands
|
||||
- WORKFLOW - Step-by-step validation workflows
|
||||
- VALIDATION RULES - What schemas validate
|
||||
- ERROR HANDLING - Understanding validation errors
|
||||
- SCHEMA DESIGN - Best practices and anti-patterns
|
||||
- INTEGRATION - CI/CD, git hooks, build systems
|
||||
- EXAMPLES - Practical usage demonstrations
|
||||
- Plus standard manpage sections: FILES, EXIT STATUS, ENVIRONMENT, SEE ALSO, etc.
|
||||
|
||||
**Statistics:**
|
||||
- 19 H2 sections
|
||||
- 24 H3 subsections
|
||||
- 147 paragraphs
|
||||
- 23 code examples
|
||||
- 105 emphasis markers
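
As a quick sanity check, the statistics above can be compared with the ranges listed under Schema Requirements. The sketch below is not MarkiTect's validator; the dictionary keys and the `out_of_range` helper are made-up names, and the H1 count of 1 is assumed since the statistics do not list it.

```python
# Ranges copied from the "Schema Requirements" list; key names are illustrative.
RANGES = {
    "h1": (1, 1),
    "h2": (3, 30),
    "h3": (0, 50),
    "paragraphs": (5, 500),
    "code_blocks": (1, 50),
    "emphasis": (10, 500),
}


def out_of_range(counts):
    """Return the names of elements whose count falls outside its allowed range."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= counts.get(name, 0) <= high
    ]


# Counts from the "Statistics" list (h1 is an assumption, see above).
manual = {"h1": 1, "h2": 19, "h3": 24, "paragraphs": 147, "code_blocks": 23, "emphasis": 105}
violations = out_of_range(manual)
```

With these numbers every count sits inside its range, which is consistent with the manual validating against the schema.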
|
||||
|
||||
## Running the Example
|
||||
|
||||
### 1. Validate the Manual Against the Schema
|
||||
|
||||
Verify that the manual conforms to the manpage schema:
|
||||
|
||||
```bash
cd examples/manpages

markitect validate markdown-schema-validation.1.md \
  --schema markdown-manpage-schema.json
```
|
||||
|
||||
Expected output: ✅ **VALID** - Document structure matches schema requirements
|
||||
|
||||
### 2. Show Detailed Validation
|
||||
|
||||
See detailed validation information:
|
||||
|
||||
```bash
markitect validate markdown-schema-validation.1.md \
  --schema markdown-manpage-schema.json \
  --detailed-errors
```
|
||||
|
||||
### 3. Generate Schema from the Manual
|
||||
|
||||
Analyze the manual's actual structure:
|
||||
|
||||
```bash
markitect schema-generate markdown-schema-validation.1.md \
  --output actual-structure-schema.json

cat actual-structure-schema.json
```
|
||||
|
||||
Compare the generated schema with the manpage schema to see how the manual conforms.
|
||||
|
||||
### 4. Examine AST Structure
|
||||
|
||||
View the parsed structure of the manual:
|
||||
|
||||
```bash
markitect ast-show markdown-schema-validation.1.md --format tree
```
|
||||
|
||||
Or in compact format:
|
||||
|
||||
```bash
markitect ast-show markdown-schema-validation.1.md --format compact | head -50
```
|
||||
|
||||
### 5. Store Schema for Reuse
|
||||
|
||||
Add the manpage schema to MarkiTect's database:
|
||||
|
||||
```bash
markitect schema-ingest markdown-manpage-schema.json
markitect schema-list
```
|
||||
|
||||
### 6. Validate Other Manpages
|
||||
|
||||
Use the schema to validate other manual pages in the project:
|
||||
|
||||
```bash
markitect validate ../../docs/manuals/markitect.1.md \
  --schema markdown-manpage-schema.json

markitect validate ../../docs/manuals/issue.1.md \
  --schema markdown-manpage-schema.json
```
|
||||
|
||||
### 7. Generate Manpage Template
|
||||
|
||||
Create a template for new manpages:
|
||||
|
||||
```bash
markitect generate-stub markdown-manpage-schema.json \
  --output new-manpage-template.md

cat new-manpage-template.md
```
|
||||
|
||||
## What This Example Demonstrates
|
||||
|
||||
### 1. Schema-Driven Documentation
|
||||
|
||||
The manpage schema defines what a valid Unix manual page looks like:
|
||||
|
||||
- Required structural elements (title, synopsis, description)
|
||||
- Heading hierarchy constraints
|
||||
- Content density requirements (minimum paragraphs, code examples)
|
||||
- Formatting conventions (bold commands, italic arguments)
|
||||
|
||||
### 2. Self-Validating System
|
||||
|
||||
The schema validation manual validates against the manpage schema, proving:
|
||||
|
||||
- The schema is practical and usable
|
||||
- The manual follows manpage conventions
|
||||
- Schema validation works as documented
|
||||
- The system is reliable enough to document itself
|
||||
|
||||
### 3. Structural vs Semantic Validation
|
||||
|
||||
The schema validates **structure**, not **content**:
|
||||
|
||||
- ✅ Validates: Correct number of sections, heading levels, code examples present
|
||||
- ❌ Does not validate: Grammar, code correctness, factual accuracy, logical flow
|
||||
|
||||
This distinction is crucial for understanding what schemas can and cannot do.
|
||||
|
||||
### 4. Reusable Patterns
|
||||
|
||||
The manpage schema is a reusable pattern that can:
|
||||
|
||||
- Validate any Unix-style manual page
|
||||
- Enforce documentation consistency across a project
|
||||
- Generate templates for new documentation
|
||||
- Integrate into CI/CD pipelines for quality checks
|
||||
|
||||
### 5. Custom Schema Extensions
|
||||
|
||||
The schema demonstrates MarkiTect's custom extensions:
|
||||
|
||||
```json
"x-markitect-required-sections": [
  "SYNOPSIS",
  "DESCRIPTION"
],
"x-markitect-recommended-sections": [
  "OPTIONS",
  "EXAMPLES",
  "SEE ALSO"
],
"x-markitect-conventions": {
  "heading_case": "UPPERCASE for H2 sections",
  "command_format": "Bold with **command**",
  "argument_format": "Italic with *ARG*"
}
```
|
||||
|
||||
These extensions provide metadata about schema intent and conventions beyond structural validation.
|
||||
|
||||
## Validation Workflow Demonstrated
|
||||
|
||||
This example shows the complete schema validation workflow:
|
||||
|
||||
### Step 1: Schema Creation
|
||||
- Analyze existing manpages (markitect.1.md, issue.1.md)
|
||||
- Identify common structural patterns
|
||||
- Generate base schema from example document
|
||||
- Refine schema to be flexible yet meaningful
|
||||
|
||||
### Step 2: Schema Refinement
|
||||
- Adjust minItems/maxItems for appropriate ranges
|
||||
- Add custom MarkiTect extensions
|
||||
- Include heading patterns and conventions
|
||||
- Balance strictness with flexibility
|
||||
|
||||
### Step 3: Document Creation
|
||||
- Write document following schema structure
|
||||
- Use template generated from schema as starting point
|
||||
- Ensure all required sections present
|
||||
- Include appropriate code examples and formatting
|
||||
|
||||
### Step 4: Validation
|
||||
- Validate document against schema
|
||||
- Review validation errors if any
|
||||
- Fix structural issues
|
||||
- Re-validate until passing
|
||||
|
||||
### Step 5: Iteration
|
||||
- Refine schema based on validation experience
|
||||
- Adjust constraints for real-world use cases
|
||||
- Document lessons learned
|
||||
- Share schema for reuse
|
||||
|
||||
## Integration Examples
|
||||
|
||||
### CI/CD Integration
|
||||
|
||||
Add to `.github/workflows/docs.yml` or similar:
|
||||
|
||||
```yaml
- name: Validate Manpages
  run: |
    for manpage in docs/manuals/*.md; do
      markitect validate "$manpage" \
        --schema examples/manpages/markdown-manpage-schema.json \
        || exit 1
    done
```
|
||||
|
||||
### Pre-commit Hook
|
||||
|
||||
Add to `.git/hooks/pre-commit`:
|
||||
|
||||
```bash
#!/bin/bash
changed_manpages=$(git diff --cached --name-only --diff-filter=ACM | grep 'docs/manuals/.*\.md$')

for manpage in $changed_manpages; do
  markitect validate "$manpage" \
    --schema examples/manpages/markdown-manpage-schema.json \
    --quiet || {
      echo "Manpage validation failed: $manpage"
      markitect validate "$manpage" \
        --schema examples/manpages/markdown-manpage-schema.json \
        --detailed-errors
      exit 1
    }
done
```
|
||||
|
||||
### Makefile Integration
|
||||
|
||||
Add to project `Makefile`:
|
||||
|
||||
```makefile
.PHONY: validate-manpages
validate-manpages:
	@echo "Validating manual pages..."
	@for manpage in docs/manuals/*.md; do \
		markitect validate "$$manpage" \
			--schema examples/manpages/markdown-manpage-schema.json \
			|| exit 1; \
	done
	@echo "✅ All manpages valid"

.PHONY: docs
docs: validate-manpages
	# Continue with doc generation...
```
|
||||
|
||||
## Key Lessons from This Example
|
||||
|
||||
### 1. Start with Real Documents
|
||||
|
||||
The manpage schema was created by analyzing existing manpages (markitect.1.md, issue.1.md), not designed in isolation. This ensures the schema reflects real-world usage.
|
||||
|
||||
### 2. Use Ranges, Not Exact Counts
|
||||
|
||||
The schema uses ranges like `5-500 paragraphs` instead of exact counts. This provides flexibility while still enforcing quality standards.
|
||||
|
||||
### 3. Required vs Recommended
|
||||
|
||||
The schema distinguishes between required sections (SYNOPSIS, DESCRIPTION) and recommended sections (EXAMPLES, SEE ALSO), allowing flexibility where appropriate.
|
||||
|
||||
### 4. Validate Structure, Not Semantics
|
||||
|
||||
Schemas validate document structure, not content quality. Grammar checking, code correctness, and factual accuracy require other tools.
|
||||
|
||||
### 5. Progressive Refinement
|
||||
|
||||
Schemas should evolve based on validation experience. Start loose, tighten based on actual needs, never over-specify.
|
||||
|
||||
### 6. Documentation is Essential
|
||||
|
||||
The schema includes extensive metadata about conventions and intent through custom extensions, making it self-documenting.
|
||||
|
||||
## Extending This Example
|
||||
|
||||
### Create Schema Variants
|
||||
|
||||
Create specialized schemas for different manpage types:
|
||||
|
||||
```bash
# For command manpages (section 1)
cp markdown-manpage-schema.json command-manpage-schema.json
# Edit to require COMMANDS section

# For format manpages (section 5)
cp markdown-manpage-schema.json format-manpage-schema.json
# Edit to require FORMAT section

# For convention manpages (section 7)
cp markdown-manpage-schema.json convention-manpage-schema.json
# Edit to be more flexible
```
|
||||
|
||||
### Validate Your Own Documentation
|
||||
|
||||
Apply the manpage schema to your project:
|
||||
|
||||
```bash
# Validate README
markitect validate README.md \
  --schema markdown-manpage-schema.json

# May need adjustments for non-manpage docs
```
|
||||
|
||||
### Generate Schema Family
|
||||
|
||||
Create schemas for related document types:
|
||||
|
||||
- API documentation schema
|
||||
- Tutorial schema
|
||||
- RFC/specification schema
|
||||
- Architecture decision record (ADR) schema
|
||||
|
||||
Each can follow similar validation principles while enforcing type-specific structure.
|
||||
|
||||
## Further Reading
|
||||
|
||||
- **markdown-schema-validation.1.md** - Complete reference for schema validation
|
||||
- **../../docs/manuals/markitect.1.md** - MarkiTect command reference
|
||||
- **JSON Schema Specification** - https://json-schema.org/
|
||||
- **Unix Manual Page Conventions** - `man 7 man-pages` on Unix systems
|
||||
|
||||
## Validation Results
|
||||
|
||||
This example has been validated to confirm:
|
||||
|
||||
✅ Manual validates against manpage schema
|
||||
✅ Schema is well-formed JSON Schema draft-07
|
||||
✅ All required sections present in manual
|
||||
✅ Heading hierarchy follows Unix conventions
|
||||
✅ Code examples demonstrate actual usage
|
||||
✅ Structure matches defined constraints
|
||||
|
||||
## License
|
||||
|
||||
Part of the MarkiTect project. Licensed under MIT License.
|
||||
|
||||
---
|
||||
|
||||
**Note**: This example represents a complete, production-ready use case of MarkiTect's schema validation system. The files can be used as-is or adapted for your own documentation requirements.
|
||||
787
examples/manpages/SCHEMA_EVOLUTION_WORKPLAN.md
Normal file
@@ -0,0 +1,787 @@
|
||||
# MarkiTect Schema Evolution Workplan
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Current State**: MarkiTect validates document structure via JSON Schema, but is too rigid (exact counts) and structure-only (no content guidance).
|
||||
|
||||
**Target State**: A flexible schema system with content control, section classification, multi-schema conformance, and blueprint-based document generation.
|
||||
|
||||
**Timeline**: 5 phases, 15-20 development sessions, approximately 8-10 weeks.
|
||||
|
||||
---
|
||||
|
||||
## Problem Analysis
|
||||
|
||||
### Current Limitations
|
||||
|
||||
#### 1. Structural Rigidity
|
||||
**Problem**: Auto-generated schemas use exact counts
|
||||
```json
|
||||
"paragraphs": { "minItems": 86, "maxItems": 86 }
|
||||
```
|
||||
**Impact**: Schemas are document-specific, not reusable patterns.
|
||||
|
||||
#### 2. Binary Structure Validation
|
||||
**Problem**: Elements are either valid or invalid, no classification.
|
||||
**Need**: Required, Recommended, Optional, Discouraged, Improper classifications.
|
||||
|
||||
#### 3. No Content Guidance
|
||||
**Problem**: Schemas validate structure exists, not what content belongs there.
|
||||
**Need**: Content instructions, semantic patterns, quality expectations.
|
||||
|
||||
#### 4. Single Schema Limitation
|
||||
**Problem**: Documents can only conform to one schema.
|
||||
**Need**: Multi-schema conformance (e.g., "manpage" + "API reference" + "tutorial").
|
||||
|
||||
#### 5. Template Generation Gap
|
||||
**Problem**: `generate-stub` creates outline, but no content guidance or data binding.
|
||||
**Need**: Blueprint system with content instructions and data templates.
|
||||
|
||||
---
|
||||
|
||||
## Proposed Architecture
|
||||
|
||||
### Three-Layer System
|
||||
|
||||
```
┌─────────────────────────────────────────────┐
│               BLUEPRINT LAYER               │
│  (Multi-schema + Content + Data Templates)  │
└─────────────────────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────┐
│           SCHEMA LAYER (Enhanced)           │
│ (Structure + Classification + Instructions) │
└─────────────────────────────────────────────┘
                       ↓
┌─────────────────────────────────────────────┐
│              VALIDATION LAYER               │
│     (AST Validation + Content Analysis)     │
└─────────────────────────────────────────────┘
```
|
||||
|
||||
### Key Concepts
|
||||
|
||||
**1. Schema Classification System**
|
||||
- **Required**: Must be present, validation fails if missing
|
||||
- **Recommended**: Should be present, warning if missing
|
||||
- **Optional**: May be present, no validation impact
|
||||
- **Discouraged**: Should not be present, warning if present
|
||||
- **Improper**: Must not be present, validation fails if present
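
The five classifications map onto errors and warnings in a fairly mechanical way. The following is a minimal sketch of that mapping, not the planned implementation; the function name and the plain `dict`/`set` inputs are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def classify_findings(
    present: Set[str], classifications: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for the sections found in a document."""
    errors: List[str] = []
    warnings: List[str] = []
    for section, level in sorted(classifications.items()):
        found = section in present
        if level == "required" and not found:
            errors.append(f"Missing required section: {section}")
        elif level == "improper" and found:
            errors.append(f"Improper section present: {section}")
        elif level == "recommended" and not found:
            warnings.append(f"Missing recommended section: {section}")
        elif level == "discouraged" and found:
            warnings.append(f"Discouraged section present: {section}")
        # "optional" sections never affect the result.
    return errors, warnings
```

Only the required and improper levels produce errors, so a document with nothing but recommended or discouraged findings would still pass, with warnings.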
|
||||
|
||||
**2. Content Control**
|
||||
- **Content Instructions**: Human-readable guidance for section content
|
||||
- **Content Patterns**: Regex/template patterns for content validation
|
||||
- **Content Quality Metrics**: Word count, readability, completeness scoring
|
||||
|
||||
**3. Multi-Schema Conformance**
|
||||
- Documents can conform to multiple schemas simultaneously
|
||||
- Schema composition and inheritance
|
||||
- Conflict resolution strategies
|
||||
|
||||
**4. Blueprint System**
|
||||
- Schemas + Instructions + Data Templates = Blueprints
|
||||
- Blueprints generate documents with content guidance
|
||||
- Data binding for dynamic document generation
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Enhanced Schema Format
|
||||
|
||||
**Goal**: Extend JSON Schema with MarkiTect-specific content control extensions.
|
||||
|
||||
### 1.1 Schema Classification Extensions
|
||||
|
||||
**New Properties**:
|
||||
```json
{
  "x-markitect-sections": {
    "SYNOPSIS": {
      "classification": "required",
      "heading_level": 2,
      "position": "after_title",
      "content_instruction": "Brief command syntax showing all options",
      "min_code_blocks": 1,
      "max_code_blocks": 3
    },
    "EXAMPLES": {
      "classification": "recommended",
      "heading_level": 2,
      "content_instruction": "Practical usage examples with explanations",
      "min_code_blocks": 3,
      "warning_if_missing": "Examples greatly improve documentation usability"
    },
    "DEPRECATED": {
      "classification": "discouraged",
      "heading_level": 2,
      "warning_message": "DEPRECATED sections should be moved to historical docs"
    },
    "INTERNAL_NOTES": {
      "classification": "improper",
      "heading_level": 2,
      "error_message": "Internal notes must not appear in published documentation"
    }
  }
}
```
|
||||
|
||||
### 1.2 Content Control Extensions
|
||||
|
||||
**New Properties**:
|
||||
```json
|
||||
{
|
||||
"x-markitect-content-control": {
|
||||
"synopsis_section": {
|
||||
"min_paragraphs": 1,
|
||||
"max_paragraphs": 3,
|
||||
"required_patterns": [
|
||||
"\\*\\*[a-z-]+\\*\\*.*\\[.*\\]" // Bold command with args
|
||||
],
|
||||
"content_quality": {
|
||||
"min_words": 10,
|
||||
"max_words": 100,
|
||||
"readability_target": "technical"
|
||||
},
|
||||
"content_instructions": [
|
||||
"Show command name in bold",
|
||||
"Include all major options in synopsis",
|
||||
"Use italic for arguments and placeholders"
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 1.3 Flexible Structure Constraints
|
||||
|
||||
**Replace rigid counts with ranges and classifications**:
|
||||
```json
|
||||
{
|
||||
"properties": {
|
||||
"headings": {
|
||||
"properties": {
|
||||
"level_2": {
|
||||
"items": {
|
||||
"properties": {
|
||||
"content": {
|
||||
"oneOf": [
|
||||
{"const": "SYNOPSIS", "x-markitect-classification": "required"},
|
||||
{"const": "DESCRIPTION", "x-markitect-classification": "required"},
|
||||
{"const": "EXAMPLES", "x-markitect-classification": "recommended"},
|
||||
{"const": "SEE ALSO", "x-markitect-classification": "optional"}
|
||||
]
|
||||
}
|
||||
}
|
||||
},
|
||||
"minItems": 2, // At least required sections
|
||||
"maxItems": 30 // Reasonable upper bound
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **Task 1.1**: Define `x-markitect-sections` schema extension format
|
||||
- [ ] **Task 1.2**: Define `x-markitect-content-control` schema extension format
|
||||
- [ ] **Task 1.3**: Update metaschema to validate new extensions
|
||||
- [ ] **Task 1.4**: Create schema examples demonstrating all classifications
|
||||
- [ ] **Task 1.5**: Document schema extension format
|
||||
|
||||
**Duration**: 3-4 sessions
|
||||
**Dependencies**: None
|
||||
**Deliverables**: Enhanced schema format specification, updated metaschema
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Schema Refinement Tools
|
||||
|
||||
**Goal**: Tools to transform rigid auto-generated schemas into flexible, classified schemas.
|
||||
|
||||
### 2.1 Schema Analysis Tool
|
||||
|
||||
**Command**: `markitect schema-analyze`
|
||||
|
||||
Analyzes existing schema and suggests improvements:
|
||||
```bash
|
||||
markitect schema-analyze rigid-schema.json
|
||||
|
||||
# Output:
|
||||
⚠️ Exact counts detected (86 paragraphs)
|
||||
Suggestion: Use range 50-150 for flexibility
|
||||
|
||||
⚠️ All sections unclassified
|
||||
Suggestion: Classify sections as required/recommended/optional
|
||||
|
||||
⚠️ No content instructions
|
||||
Suggestion: Add content guidance for key sections
|
||||
|
||||
✨ Run: markitect schema-refine rigid-schema.json
|
||||
```
|
||||
|
||||
### 2.2 Schema Refinement Tool
|
||||
|
||||
**Command**: `markitect schema-refine`
|
||||
|
||||
Interactive or automated schema refinement:
|
||||
```bash
# Automated: Apply common refinements
markitect schema-refine rigid-schema.json \
  --loosen-counts \
  --add-classifications \
  --output flexible-schema.json

# Interactive: Guided refinement
markitect schema-refine rigid-schema.json --interactive
```
|
||||
|
||||
**Refinement Operations**:
|
||||
- Convert exact counts to ranges (configurable tolerance)
|
||||
- Classify sections based on conventions
|
||||
- Add content instructions from templates
|
||||
- Merge multiple schemas for common patterns
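
The first operation, turning an exact count into a range, might look roughly like this. It is a sketch under assumptions: the `loosen` name and the default tolerance of 0.5 are invented here, and the real tool may round to friendlier bounds (the analysis output above suggests 50-150 for 86 paragraphs).

```python
import math


def loosen(count: int, tolerance: float = 0.5) -> tuple:
    """Widen an exact count into a (min, max) range by `tolerance` on each side."""
    low = max(0, math.floor(count * (1 - tolerance)))
    high = math.ceil(count * (1 + tolerance))
    return low, high


# An auto-generated schema with minItems == maxItems == 86 becomes a range.
paragraph_range = loosen(86)
```

A smaller tolerance keeps the schema closer to the source document; a larger one makes it more reusable at the cost of weaker checks.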
|
||||
|
||||
### 2.3 Schema Composition Tool
|
||||
|
||||
**Command**: `markitect schema-compose`
|
||||
|
||||
Combine multiple schemas:
|
||||
```bash
# Create composite schema
markitect schema-compose \
  --base manpage-schema.json \
  --extend api-reference-schema.json \
  --extend tutorial-schema.json \
  --output composite-schema.json
```
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **Task 2.1**: Implement `schema-analyze` command
|
||||
- [ ] **Task 2.2**: Implement `schema-refine` command with loosening logic
|
||||
- [ ] **Task 2.3**: Implement `schema-refine --interactive` mode
|
||||
- [ ] **Task 2.4**: Implement `schema-compose` command
|
||||
- [ ] **Task 2.5**: Create schema refinement rule library
|
||||
|
||||
**Duration**: 3-4 sessions
|
||||
**Dependencies**: Phase 1 complete
|
||||
**Deliverables**: Schema analysis, refinement, and composition tools
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Enhanced Validation Engine
|
||||
|
||||
**Goal**: Validate classification levels, content patterns, and multi-schema conformance.
|
||||
|
||||
### 3.1 Classification-Aware Validation
|
||||
|
||||
**Validation Levels**:
|
||||
```python
from __future__ import annotations  # defer evaluation of the not-yet-defined types below

from typing import List, Literal


class ValidationResult:
    status: Literal["valid", "valid_with_warnings", "invalid"]
    errors: List[ValidationError]      # Required/Improper violations
    warnings: List[ValidationWarning]  # Recommended/Discouraged violations
    suggestions: List[str]             # Optional improvements
```
|
||||
|
||||
**Example Output**:
|
||||
```bash
|
||||
markitect validate document.md --schema schema.json --detailed-errors
|
||||
|
||||
❌ ERRORS (validation failed)
|
||||
- Missing required section: SYNOPSIS
|
||||
- Improper section present: INTERNAL_NOTES
|
||||
|
||||
⚠️ WARNINGS
|
||||
- Missing recommended section: EXAMPLES
|
||||
- Discouraged section present: DEPRECATED
|
||||
|
||||
💡 SUGGESTIONS
|
||||
- Consider adding optional section: PERFORMANCE
|
||||
- Content quality: DESCRIPTION section below recommended word count (45/100)
|
||||
|
||||
Status: INVALID (2 errors, 2 warnings)
|
||||
```
|
||||
|
||||
### 3.2 Content Pattern Validation
|
||||
|
||||
**Validate content patterns**:
|
||||
```python
|
||||
# Schema specifies required patterns
|
||||
"synopsis_section": {
|
||||
"required_patterns": [
|
||||
r"\*\*command\*\*", # Bold command name
|
||||
r"\[.*\]" # Options in brackets
|
||||
],
|
||||
"discouraged_patterns": [
|
||||
r"TODO", # No TODOs in published docs
|
||||
r"FIXME"
|
||||
]
|
||||
}
|
||||
```
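
A check against such pattern lists could be sketched as below. The `check_patterns` helper is hypothetical and simply reports which required patterns are missing and which discouraged patterns appear; how the planned validator will report these is not specified here.

```python
import re


def check_patterns(text, required, discouraged):
    """Return (missing_required, present_discouraged) pattern lists for `text`."""
    missing = [p for p in required if re.search(p, text) is None]
    unwanted = [p for p in discouraged if re.search(p, text) is not None]
    return missing, unwanted


# Patterns taken from the synopsis_section example above.
required = [r"\*\*command\*\*", r"\[.*\]"]
discouraged = [r"TODO", r"FIXME"]

good = check_patterns("**command** [OPTIONS] FILE", required, discouraged)
bad = check_patterns("command FILE TODO", required, discouraged)
```

Missing required patterns would presumably surface as errors and discouraged matches as warnings, in line with the classification levels.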
|
||||
|
||||
### 3.3 Multi-Schema Validation
|
||||
|
||||
**Command**: `markitect validate --schemas`
|
||||
|
||||
```bash
|
||||
# Validate against multiple schemas
|
||||
markitect validate api-doc.md \
|
||||
--schemas manpage.json,api-reference.json,tutorial.json \
|
||||
--require-all
|
||||
|
||||
# Output shows conformance to each schema
|
||||
✅ manpage.json: VALID
|
||||
✅ api-reference.json: VALID (2 warnings)
|
||||
❌ tutorial.json: INVALID (missing required section: GETTING STARTED)
|
||||
|
||||
Overall: INVALID (must conform to all schemas)
|
||||
```
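
Combining the per-schema results into an overall status can be sketched in a few lines. The `overall_status` function is illustrative only; in particular, treating the absence of `--require-all` as "any schema may pass" is an assumption, since that case is not described above.

```python
def overall_status(results, require_all=True):
    """Combine per-schema outcomes (schema name -> bool) into one status string."""
    if require_all:
        passed = all(results.values())
    else:
        # Assumed behaviour without --require-all: one conforming schema is enough.
        passed = any(results.values())
    return "VALID" if passed else "INVALID"


# Outcomes matching the example output above.
results = {"manpage.json": True, "api-reference.json": True, "tutorial.json": False}
status = overall_status(results)
```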
|
||||
|
||||
### 3.4 Content Quality Metrics
|
||||
|
||||
**Validate content quality**:
|
||||
```bash
|
||||
markitect validate document.md --schema schema.json --quality-check
|
||||
|
||||
📊 Content Quality Report
|
||||
- Word count: 487 (target: 300-1000) ✅
|
||||
- Code examples: 3 (minimum: 3) ✅
|
||||
- Readability: Technical (appropriate) ✅
|
||||
- Link validity: 12/12 valid ✅
|
||||
- Heading hierarchy: Valid ✅
|
||||
|
||||
Quality Score: 95/100
|
||||
```
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **Task 3.1**: Implement classification-aware validator
|
||||
- [ ] **Task 3.2**: Implement content pattern validation
|
||||
- [ ] **Task 3.3**: Implement multi-schema validation
|
||||
- [ ] **Task 3.4**: Implement content quality metrics
|
||||
- [ ] **Task 3.5**: Enhanced error reporting with suggestions
|
||||
|
||||
**Duration**: 4-5 sessions
|
||||
**Dependencies**: Phase 1 complete
|
||||
**Deliverables**: Enhanced validation engine, quality metrics
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Blueprint System
|
||||
|
||||
**Goal**: Document generation system with schemas + content instructions + data templates.
|
||||
|
||||
### 4.1 Blueprint Format
|
||||
|
||||
**Blueprint Structure**:
|
||||
```json
|
||||
{
|
||||
"$blueprint": "1.0",
|
||||
"name": "api-documentation-blueprint",
|
||||
"description": "Blueprint for API endpoint documentation",
|
||||
|
||||
"schemas": [
|
||||
"manpage-schema.json",
|
||||
"api-reference-schema.json"
|
||||
],
|
||||
|
||||
"content_model": {
|
||||
"synopsis": {
|
||||
"template": "**{{command}}** [*OPTIONS*] *{{primary_argument}}*",
|
||||
"data_source": "command_metadata.json",
|
||||
"instruction": "Brief command syntax"
|
||||
},
|
||||
"description": {
|
||||
"template": "{{description}}\n\nThis endpoint {{purpose}}.",
|
||||
"min_paragraphs": 2,
|
||||
"instruction": "Explain what the endpoint does and why to use it"
|
||||
},
|
||||
"parameters": {
|
||||
"template": "{{#each parameters}}\n**{{name}}** *{{type}}*\n: {{description}}\n{{/each}}",
|
||||
"data_source": "parameters",
|
||||
"instruction": "Document all parameters with types and descriptions"
|
||||
}
|
||||
},
|
||||
|
||||
"data_schema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"command": {"type": "string"},
|
||||
"primary_argument": {"type": "string"},
|
||||
"description": {"type": "string"},
|
||||
"purpose": {"type": "string"},
|
||||
"parameters": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"name": {"type": "string"},
|
||||
"type": {"type": "string"},
|
||||
"description": {"type": "string"}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
|
||||
"generation_rules": {
|
||||
"heading_style": "atx",
|
||||
"code_fence_style": "backticks",
|
||||
"line_length": 80,
|
||||
"include_metadata": true
|
||||
}
|
||||
}
|
||||
```
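
To make the data-binding idea concrete, here is a minimal sketch that fills the `synopsis` template from a data dictionary. It handles only plain `{{name}}` placeholders, not block helpers such as `{{#each}}`; the `render` function and the sample values are assumptions, not part of the planned template engine.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, data: dict) -> str:
    """Replace each {{name}} placeholder with data[name]; raises KeyError if missing."""
    return PLACEHOLDER.sub(lambda match: str(data[match.group(1)]), template)


# Template copied from the blueprint's "synopsis" entry; the data is illustrative.
synopsis = render(
    "**{{command}}** [*OPTIONS*] *{{primary_argument}}*",
    {"command": "api-call", "primary_argument": "ENDPOINT"},
)
```

Failing loudly on a missing key is a deliberate choice in this sketch, so that incomplete data files are caught at generation time rather than producing documents with empty fields.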
|
||||
|
||||
### 4.2 Blueprint Commands
|
||||
|
||||
**Create Blueprint**:
|
||||
```bash
|
||||
# From existing schema
|
||||
markitect blueprint-create --from-schema api-schema.json \
|
||||
--output api-blueprint.json
|
||||
|
||||
# Interactive creation
|
||||
markitect blueprint-create --interactive
|
||||
```
|
||||
|
||||
**Generate from Blueprint**:
|
||||
```bash
|
||||
# Generate with data file
|
||||
markitect blueprint-generate api-blueprint.json \
|
||||
--data endpoint-data.json \
|
||||
--output api-doc.md
|
||||
|
||||
# Generate with inline data
|
||||
markitect blueprint-generate api-blueprint.json \
|
||||
--data '{"command": "api-call", "description": "Make API call"}' \
|
||||
--output api-doc.md
|
||||
|
||||
# Batch generation
|
||||
markitect blueprint-generate-batch api-blueprint.json \
|
||||
--data-dir ./endpoints/ \
|
||||
--output-dir ./docs/api/
|
||||
```
|
||||
|
||||
**Validate Blueprint**:
|
||||
```bash
|
||||
# Validate blueprint format
|
||||
markitect blueprint-validate api-blueprint.json
|
||||
|
||||
# Test blueprint generation
|
||||
markitect blueprint-test api-blueprint.json \
|
||||
--sample-data test-data.json
|
||||
```
|
||||
|
||||
### 4.3 Template Engine Integration
|
||||
|
||||
**Handlebars-style templates with MarkiTect extensions**:
|
||||
```markdown
|
||||
# {{command}}(1) - {{title}}
|
||||
|
||||
## SYNOPSIS
|
||||
|
||||
**{{command}}** {{#each options}}[*{{this}}*] {{/each}}*{{argument}}*
|
||||
|
||||
## DESCRIPTION
|
||||
|
||||
{{description}}
|
||||
|
||||
{{#markitect-section "technical-details"}}
|
||||
Technical implementation details for {{command}}.
|
||||
{{/markitect-section}}
|
||||
|
||||
## PARAMETERS
|
||||
|
||||
{{#each parameters}}
|
||||
**--{{name}}** *{{type}}*
|
||||
: {{description}}
|
||||
{{#if default}}: Default: `{{default}}`{{/if}}
|
||||
|
||||
{{/each}}
|
||||
|
||||
{{#markitect-code-block "bash"}}
|
||||
# Example usage
|
||||
{{command}} {{#each examples.[0].args}}{{this}} {{/each}}
|
||||
{{/markitect-code-block}}
|
||||
```
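
To make the substitution step concrete, here is a minimal Python sketch of strict placeholder rendering. It is not Handlebars and not the planned MarkiTect engine: it handles only plain `{{name}}` variables, leaves block tags such as `{{#each ...}}` untouched, and never evaluates template text as code. The function name `render_simple` is hypothetical.

```python
import re

# Matches plain variables such as {{command}}; block tags like {{#each x}}
# start with '#' or '/' and are deliberately not matched.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_simple(template: str, data: dict) -> str:
    """Replace {{name}} placeholders with values from data, failing loudly."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in data:
            # Strict parsing: a missing value is an error, not an empty string.
            raise KeyError(f"missing template value: {key}")
        return str(data[key])

    return PLACEHOLDER.sub(substitute, template)
```

Calling `render_simple("# {{command}}(1) - {{title}}", {"command": "api-call", "title": "Make API call"})` fills in the title line. Because values are only looked up in a dictionary, data cannot inject executable logic into the template.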
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **Task 4.1**: Define blueprint format specification
|
||||
- [ ] **Task 4.2**: Implement `blueprint-create` command
|
||||
- [ ] **Task 4.3**: Implement `blueprint-generate` command
|
||||
- [ ] **Task 4.4**: Implement template engine with Handlebars
|
||||
- [ ] **Task 4.5**: Implement `blueprint-validate` command
|
||||
- [ ] **Task 4.6**: Implement batch generation
|
||||
- [ ] **Task 4.7**: Create blueprint library (common patterns)
|
||||
|
||||
**Duration**: 5-6 sessions
|
||||
**Dependencies**: Phases 1 and 3 complete
|
||||
**Deliverables**: Blueprint system, template engine, generation commands
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Documentation and Integration
|
||||
|
||||
**Goal**: Comprehensive documentation, examples, and ecosystem integration.
|
||||
|
||||
### 5.1 Documentation Suite
|
||||
|
||||
**Documents to Create**:
|
||||
- [ ] Schema Evolution Guide (why and how)
|
||||
- [ ] Schema Classification Reference
|
||||
- [ ] Content Control Specification
|
||||
- [ ] Blueprint System Guide
|
||||
- [ ] Schema Design Best Practices
|
||||
- [ ] Migration Guide (old schemas → new format)
|
||||
- [ ] API Reference for programmatic usage
|
||||
|
||||
### 5.2 Example Gallery
|
||||
|
||||
**Create comprehensive examples**:
|
||||
- [ ] Manpage blueprint (already started)
|
||||
- [ ] API documentation blueprint
|
||||
- [ ] Tutorial document blueprint
|
||||
- [ ] Architecture Decision Record (ADR) blueprint
|
||||
- [ ] RFC/specification blueprint
|
||||
- [ ] Meeting notes blueprint
|
||||
- [ ] Project README blueprint
|
||||
|
||||
### 5.3 CLI Integration
|
||||
|
||||
**Update existing commands**:
|
||||
```bash
# schema-generate with classification
markitect schema-generate example.md \
  --classify-sections \
  --add-instructions \
  --flexible \
  --output smart-schema.json

# validate with multiple schemas
markitect validate doc.md \
  --schemas schema1.json,schema2.json \
  --classification-aware \
  --quality-check

# generate-stub enhanced
markitect generate-stub schema.json \
  --include-instructions \
  --sample-content \
  --output template.md
```
|
||||
|
||||
### 5.4 CI/CD Integration Templates
|
||||
|
||||
**Provide ready-to-use integrations**:
|
||||
|
||||
GitHub Actions:
|
||||
```yaml
- name: Validate Documentation
  uses: markitect/validate-action@v1
  with:
    schemas: docs/schemas/*.json
    files: docs/**/*.md
    classification-aware: true
    fail-on: errors
    warn-on: missing-recommended
```
|
||||
|
||||
Pre-commit hook:
|
||||
```bash
#!/bin/bash
markitect validate-changed --schemas docs/schemas/ \
  --classification-aware \
  --fail-on errors
```
|
||||
|
||||
### Tasks
|
||||
|
||||
- [ ] **Task 5.1**: Write comprehensive documentation suite
|
||||
- [ ] **Task 5.2**: Create example gallery with 7+ blueprints
|
||||
- [ ] **Task 5.3**: Update all CLI commands for new features
|
||||
- [ ] **Task 5.4**: Create CI/CD integration templates
|
||||
- [ ] **Task 5.5**: Write migration guide for existing schemas
|
||||
- [ ] **Task 5.6**: Create video tutorials/screencasts
|
||||
|
||||
**Duration**: 3-4 sessions
|
||||
**Dependencies**: All previous phases complete
|
||||
**Deliverables**: Complete documentation, examples, integrations
|
||||
|
||||
---
|
||||
|
||||
## Implementation Strategy
|
||||
|
||||
### Development Approach
|
||||
|
||||
**1. Test-Driven Development**
|
||||
- Write tests for each classification level
|
||||
- Test schema refinement transformations
|
||||
- Test blueprint generation with various data
|
||||
- Test multi-schema validation
|
||||
|
||||
**2. Backward Compatibility**
|
||||
- Existing schemas continue to work
|
||||
- New features are opt-in via extensions
|
||||
- Clear migration path documented
|
||||
|
||||
**3. Incremental Rollout**
|
||||
- Phase 1: Can be used immediately after completion
|
||||
- Each phase delivers user value independently
|
||||
- Later phases build on earlier phases
|
||||
|
||||
**4. Community Feedback**
|
||||
- Alpha release after Phase 1
|
||||
- Beta release after Phase 3
|
||||
- Stable release after Phase 5
|
||||
|
||||
### Technical Considerations
|
||||
|
||||
**Schema Format**:
|
||||
- JSON Schema draft-07 as foundation
|
||||
- MarkiTect extensions namespaced with `x-markitect-`
|
||||
- Validation via metaschema
|
||||
- Clear upgrade path to future JSON Schema versions
|
||||
|
||||
**Performance**:
|
||||
- Cache compiled schemas
|
||||
- Lazy validation for large documents
|
||||
- Parallel validation for multiple schemas
|
||||
- Optimize content pattern matching
|
||||
|
||||
**API Design**:
|
||||
- Programmatic access to all features
|
||||
- Python API for schema manipulation
|
||||
- Plugin system for custom validators
|
||||
- Extensible template engine
|
||||
|
||||
---
|
||||
|
||||
## Success Metrics
|
||||
|
||||
### Phase 1 Success
|
||||
- ✅ Schema with all 5 classifications validates correctly
|
||||
- ✅ Content instructions appear in generated stubs
|
||||
- ✅ Metaschema validates all extension formats
|
||||
|
||||
### Phase 2 Success
|
||||
- ✅ Rigid schema refined to flexible schema automatically
|
||||
- ✅ Multiple schemas composed without conflicts
|
||||
- ✅ Interactive refinement completes end-to-end
|
||||
|
||||
### Phase 3 Success
|
||||
- ✅ Validation distinguishes errors from warnings
|
||||
- ✅ Content patterns detected and reported
|
||||
- ✅ Multi-schema validation works with 3+ schemas
|
||||
- ✅ Quality metrics provide actionable feedback
|
||||
|
||||
### Phase 4 Success
|
||||
- ✅ Blueprint generates valid document from data
|
||||
- ✅ Generated document validates against source schemas
|
||||
- ✅ Batch generation processes 100+ documents
|
||||
- ✅ Template engine supports complex logic
|
||||
|
||||
### Phase 5 Success
|
||||
- ✅ Documentation covers all features
|
||||
- ✅ 7+ working blueprint examples
|
||||
- ✅ CI/CD integrations work in real projects
|
||||
- ✅ Migration guide successfully upgrades old schemas
|
||||
|
||||
---
|
||||
|
||||
## Risk Assessment
|
||||
|
||||
### Technical Risks
|
||||
|
||||
**Risk**: Schema format complexity
|
||||
**Mitigation**: Clear examples, validation tools, gradual adoption
|
||||
|
||||
**Risk**: Performance degradation with complex schemas
|
||||
**Mitigation**: Caching, optimization, benchmarking
|
||||
|
||||
**Risk**: Template engine security (code injection)
|
||||
**Mitigation**: Sandboxed execution, no eval, strict parsing
|
||||
|
||||
### Adoption Risks
|
||||
|
||||
**Risk**: Breaking changes to existing workflows
|
||||
**Mitigation**: Full backward compatibility, opt-in features
|
||||
|
||||
**Risk**: Learning curve for new features
|
||||
**Mitigation**: Excellent documentation, examples, tutorials
|
||||
|
||||
**Risk**: Feature bloat
|
||||
**Mitigation**: Keep core simple, advanced features optional
|
||||
|
||||
---
|
||||
|
||||
## Future Enhancements (Post-MVP)
|
||||
|
||||
### Potential Future Features
|
||||
|
||||
**1. Semantic Validation**
|
||||
- AI-powered content quality checking
|
||||
- Grammar and style validation
|
||||
- Factual consistency checking
|
||||
- Link and reference validation
|
||||
|
||||
**2. Visual Schema Editor**
|
||||
- Web-based GUI for schema creation
|
||||
- Visual blueprint designer
|
||||
- Live preview of generated documents
|
||||
- Drag-and-drop section arrangement
|
||||
|
||||
**3. Schema Marketplace**
|
||||
- Community schema repository
|
||||
- Reusable blueprint library
|
||||
- Rating and reviews system
|
||||
- Version management
|
||||
|
||||
**4. Advanced Blueprint Features**
|
||||
- Conditional sections based on data
|
||||
- Dynamic schema selection
|
||||
- Multi-language support
|
||||
- Custom helper functions
|
||||
|
||||
**5. Integration Ecosystem**
|
||||
- IDE plugins (VS Code, JetBrains)
|
||||
- Documentation platforms (Read the Docs, Docusaurus)
|
||||
- CMS integrations (Contentful, Strapi)
|
||||
- Static site generators (Hugo, Jekyll)
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
This workplan transforms MarkiTect from a structural validator to a comprehensive document control system:
|
||||
|
||||
**Current**: Rigid structure validation
|
||||
**Target**: Flexible content control with blueprints
|
||||
|
||||
**Key Improvements**:
|
||||
1. ✨ Classification system (required → improper)
|
||||
2. ✨ Content guidance and instructions
|
||||
3. ✨ Multi-schema conformance
|
||||
4. ✨ Blueprint-based generation
|
||||
5. ✨ Quality metrics and analysis
|
||||
|
||||
**Timeline**: ~8-10 weeks for full implementation
|
||||
**Value**: Complete CMS-like document control for markdown
|
||||
|
||||
The system remains true to MarkiTect's philosophy of treating markdown as structured data while adding the flexibility and guidance needed for real-world content management.
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. **Review and refine** this workplan
|
||||
2. **Prioritize phases** based on user needs
|
||||
3. **Create detailed specifications** for Phase 1
|
||||
4. **Set up development environment** for new features
|
||||
5. **Begin implementation** with TDD approach
|
||||
|
||||
**First Implementation Task**: Define `x-markitect-sections` format specification
|
||||
126
examples/manpages/markdown-manpage-schema.json
Normal file
@@ -0,0 +1,126 @@
|
||||
{
|
||||
"$schema": "http://json-schema.org/draft-07/schema#",
|
||||
"type": "object",
|
||||
"title": "Markdown Manpage Schema",
|
||||
"description": "JSON schema defining the structure of Unix-style manual pages written in Markdown. Compatible with man(1) section format and conventions.",
|
||||
"properties": {
|
||||
"headings": {
|
||||
"type": "object",
|
||||
"description": "Document heading structure following Unix manpage conventions",
|
||||
"properties": {
|
||||
"level_1": {
|
||||
"type": "array",
|
||||
"description": "Title heading: command(section) - brief description",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"content": {
|
||||
"type": "string",
|
||||
"pattern": "^[a-z0-9-]+\\([0-9]\\) - .+",
|
||||
"description": "Must follow format: command(section) - description"
|
||||
},
|
||||
"level": {
|
||||
"type": "integer",
|
||||
"const": 1
|
||||
}
|
||||
},
|
||||
"required": ["content", "level"]
|
||||
},
|
||||
"minItems": 1,
|
||||
"maxItems": 1
|
||||
},
|
||||
"level_2": {
|
||||
"type": "array",
|
||||
"description": "Main section headings (SYNOPSIS, DESCRIPTION, etc.)",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"content": {
|
||||
"type": "string",
|
||||
"description": "Section name in UPPERCASE"
|
||||
},
|
||||
"level": {
|
||||
"type": "integer",
|
||||
"const": 2
|
||||
}
|
||||
},
|
||||
"required": ["content", "level"]
|
||||
},
|
||||
"minItems": 3,
|
||||
"maxItems": 30
|
||||
},
|
||||
"level_3": {
|
||||
"type": "array",
|
||||
"description": "Subsection headings (optional, for grouping commands or options)",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"content": {
|
||||
"type": "string"
|
||||
},
|
||||
"level": {
|
||||
"type": "integer",
|
||||
"const": 3
|
||||
}
|
||||
},
|
||||
"required": ["content", "level"]
|
||||
},
|
||||
"minItems": 0,
|
||||
"maxItems": 50
|
||||
}
|
||||
},
|
||||
"required": ["level_1", "level_2"]
|
||||
},
|
||||
"paragraphs": {
|
||||
"type": "array",
|
||||
"description": "Text paragraphs containing descriptions and explanations",
|
||||
"minItems": 5,
|
||||
"maxItems": 500
|
||||
},
|
||||
"lists": {
|
||||
"type": "array",
|
||||
"description": "Lists for options, examples, or structured information",
|
||||
"minItems": 0,
|
||||
"maxItems": 100
|
||||
},
|
||||
"code_blocks": {
|
||||
"type": "array",
|
||||
"description": "Code examples and command demonstrations",
|
||||
"minItems": 1,
|
||||
"maxItems": 50
|
||||
},
|
||||
"emphasis": {
|
||||
"type": "array",
|
||||
"description": "Bold and italic emphasis for commands, options, and arguments",
|
||||
"minItems": 10,
|
||||
"maxItems": 500
|
||||
}
|
||||
},
|
||||
"required": ["headings", "paragraphs", "code_blocks", "emphasis"],
|
||||
"x-markitect-required-sections": [
|
||||
"SYNOPSIS",
|
||||
"DESCRIPTION"
|
||||
],
|
||||
"x-markitect-recommended-sections": [
|
||||
"OPTIONS",
|
||||
"EXAMPLES",
|
||||
"SEE ALSO",
|
||||
"COPYRIGHT"
|
||||
],
|
||||
"x-markitect-optional-sections": [
|
||||
"COMMANDS",
|
||||
"CONFIGURATION",
|
||||
"FILES",
|
||||
"EXIT STATUS",
|
||||
"ENVIRONMENT",
|
||||
"BUGS",
|
||||
"AUTHORS"
|
||||
],
|
||||
"x-markitect-conventions": {
|
||||
"heading_case": "UPPERCASE for H2 sections",
|
||||
"command_format": "Bold with **command** for commands and options",
|
||||
"argument_format": "Italic with *ARG* for arguments and placeholders",
|
||||
"example_language": "bash for code blocks",
|
||||
"definition_lists": "Use bold followed by colon for FILES, EXIT STATUS, ENVIRONMENT sections"
|
||||
}
|
||||
}
|
||||
566
examples/manpages/markdown-schema-validation.1.md
Normal file
@@ -0,0 +1,566 @@
|
||||
# markdown-schema-validation(7) - Structured Document Validation with JSON Schema
|
||||
|
||||
## SYNOPSIS
|
||||
|
||||
**markitect schema-generate** *SOURCE_FILE* [**--output** *SCHEMA_FILE*]
|
||||
|
||||
**markitect schema-ingest** *SCHEMA_FILE*
|
||||
|
||||
**markitect validate** *DOCUMENT* *SCHEMA*
|
||||
|
||||
**markitect generate-stub** *SCHEMA* [**--output** *FILE*]
|
||||
|
||||
## DESCRIPTION
|
||||
|
||||
Markdown Schema Validation is MarkiTect's system for enforcing structural consistency in markdown documents. Unlike traditional markdown linters that check syntax, schema validation ensures documents conform to predefined structural patterns by validating their Abstract Syntax Tree (AST) representation against JSON Schema definitions.
|
||||
|
||||
This approach enables content management workflows where document structure is as important as content, making it ideal for technical documentation, business documents, and any scenario requiring consistent document templates.
|
||||
|
||||
### How Schema Validation Works
|
||||
|
||||
MarkiTect parses markdown files into an AST representation, then validates the AST structure against JSON schemas. The validation process checks:
|
||||
|
||||
- **Heading hierarchy** - Required heading levels and counts
|
||||
- **Content elements** - Minimum and maximum paragraph counts
|
||||
- **Structural patterns** - Presence of lists, code blocks, tables
|
||||
- **Section organization** - Required and optional document sections
|
||||
|
||||
Schemas validate structure, not semantics. A document can pass validation while containing incorrect content, as long as the structure matches the schema.
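
The following Python sketch illustrates the kind of structural summary that validation operates on. It is a simplification under stated assumptions, not MarkiTect's parser: it scans lines instead of building a real AST, treats any run of non-blank, non-heading lines as one paragraph (so list items count as paragraphs), and the function name `summarize` is hypothetical.

```python
import re

FENCE = re.compile(r"^(`{3,}|~{3,})")
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def summarize(markdown: str) -> dict:
    """Count headings per level, paragraphs, and fenced code blocks."""
    summary = {"headings": {}, "paragraphs": 0, "code_blocks": 0}
    in_fence = False
    in_paragraph = False
    for line in markdown.splitlines():
        if FENCE.match(line):
            # A fence line toggles code-block state; count it when it opens.
            if not in_fence:
                summary["code_blocks"] += 1
            in_fence = not in_fence
            in_paragraph = False
            continue
        if in_fence:
            # Lines inside a code block are never headings or paragraphs.
            continue
        heading = HEADING.match(line)
        if heading:
            level = f"level_{len(heading.group(1))}"
            summary["headings"].setdefault(level, []).append(heading.group(2))
            in_paragraph = False
        elif not line.strip():
            in_paragraph = False
        elif not in_paragraph:
            # First non-blank line after a break starts a new paragraph.
            summary["paragraphs"] += 1
            in_paragraph = True
    return summary
```

A summary of this shape contains counts and heading text only, which is why a document with wrong content can still pass structural validation.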
|
||||
|
||||
## SCHEMA STRUCTURE
|
||||
|
||||
### JSON Schema Format
|
||||
|
||||
MarkiTect schemas are standard JSON Schema (draft-07) documents with custom extensions for markdown-specific validation.
|
||||
|
||||
#### Standard Properties
|
||||
|
||||
**properties.headings**
|
||||
: Defines heading structure by level (level_1, level_2, level_3)
|
||||
: Each level specifies minItems, maxItems, and content patterns
|
||||
|
||||
**properties.paragraphs**
|
||||
: Array constraints for paragraph counts
|
||||
: Validates document length and content density
|
||||
|
||||
**properties.code_blocks**
|
||||
: Array constraints for code examples
|
||||
: Ensures technical documentation includes examples
|
||||
|
||||
**properties.lists**
|
||||
: Array constraints for list elements
|
||||
: Validates presence of structured information
|
||||
|
||||
**properties.emphasis**
|
||||
: Array constraints for bold and italic text
|
||||
: Ensures appropriate use of emphasis
|
||||
|
||||
#### MarkiTect Extensions
|
||||
|
||||
MarkiTect extends JSON Schema with custom properties prefixed with **x-markitect-**:
|
||||
|
||||
**x-markitect-required-sections**
|
||||
: Array of required H2 section names
|
||||
: Example: ["SYNOPSIS", "DESCRIPTION", "EXAMPLES"]
|
||||
|
||||
**x-markitect-recommended-sections**
|
||||
: Array of recommended but optional section names
|
||||
: Generates warnings when missing
|
||||
|
||||
**x-markitect-outline-mode**
|
||||
: Boolean enabling outline-only validation
|
||||
: Focuses on heading structure without content validation
|
||||
|
||||
**x-markitect-heading-text-capture**
|
||||
: Boolean enabling exact heading text validation
|
||||
: Enforces specific section names
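
As an illustration of how the two section lists differ in effect, this Python sketch reports missing required sections as errors and missing recommended sections as warnings. The function name `check_sections` and the exact warning wording are assumptions; only the extension keys come from the schema format described above.

```python
def check_sections(schema: dict, h2_headings: list[str]) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for sections missing from the H2 headings."""
    present = set(h2_headings)
    errors = [
        f"Required section '{name}' not found"
        for name in schema.get("x-markitect-required-sections", [])
        if name not in present
    ]
    warnings = [
        f"Recommended section '{name}' not found"
        for name in schema.get("x-markitect-recommended-sections", [])
        if name not in present
    ]
    return errors, warnings
```

A caller would typically fail validation only when the error list is non-empty and print the warnings either way.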
|
||||
|
||||
## COMMANDS
|
||||
|
||||
### Schema Generation
|
||||
|
||||
**markitect schema-generate** *SOURCE_FILE*
|
||||
: Analyzes markdown file AST and generates JSON schema
|
||||
: Schema describes actual structure found in source document
|
||||
|
||||
**--output** *SCHEMA_FILE*
|
||||
: Write schema to file instead of stdout
|
||||
: Default: outputs to terminal
|
||||
|
||||
**--max-depth** *N*
|
||||
: Limit heading analysis to depth N
|
||||
: Useful for outline-focused schemas
|
||||
|
||||
### Schema Management
|
||||
|
||||
**markitect schema-ingest** *SCHEMA_FILE*
|
||||
: Store schema in MarkiTect database
|
||||
: Registers schema for reuse with validation commands
|
||||
|
||||
**markitect schema-list**
|
||||
: Display all stored schemas
|
||||
: Shows schema names and metadata
|
||||
|
||||
**markitect schema-get** *SCHEMA_NAME*
|
||||
: Retrieve stored schema
|
||||
: Outputs JSON schema to stdout
|
||||
|
||||
**markitect schema-delete** *SCHEMA_NAME*
|
||||
: Remove schema from database
|
||||
: Permanently deletes schema definition
|
||||
|
||||
### Document Validation
|
||||
|
||||
**markitect validate** *DOCUMENT* *SCHEMA*
|
||||
: Validate markdown document against schema
|
||||
: Returns exit code 0 for valid, 4 for invalid
|
||||
|
||||
**--detailed-errors**
|
||||
: Show detailed validation error messages
|
||||
: Includes suggestions for fixing violations
|
||||
|
||||
**--quiet**
|
||||
: Suppress output, exit code only
|
||||
: Useful for scripting and automation
|
||||
|
||||
### Template Generation
|
||||
|
||||
**markitect generate-stub** *SCHEMA*
|
||||
: Generate markdown template from schema
|
||||
: Creates document outline following schema structure
|
||||
|
||||
**--output** *FILE*
|
||||
: Write template to file
|
||||
: Default: outputs to stdout
|
||||
|
||||
## WORKFLOW
|
||||
|
||||
### Schema-Driven Development Workflow
|
||||
|
||||
The typical workflow for schema-based document management:
|
||||
|
||||
**1. Generate Schema from Example**
|
||||
|
||||
Create or identify an exemplar document with the desired structure, then generate its schema:
|
||||
|
||||
```bash
|
||||
markitect schema-generate exemplar.md --output doc-schema.json
|
||||
```
|
||||
|
||||
**2. Refine Schema**
|
||||
|
||||
Edit the generated schema to adjust constraints:
|
||||
|
||||
- Change minItems/maxItems for flexibility
|
||||
- Add required-sections extensions
|
||||
- Adjust heading patterns
|
||||
- Add content instructions
|
||||
|
||||
**3. Store Schema**
|
||||
|
||||
Register schema for reuse:
|
||||
|
||||
```bash
|
||||
markitect schema-ingest doc-schema.json
|
||||
```
|
||||
|
||||
**4. Generate Templates**
|
||||
|
||||
Create document templates from schema:
|
||||
|
||||
```bash
|
||||
markitect generate-stub doc-schema.json --output template.md
|
||||
```
|
||||
|
||||
**5. Create Documents**
|
||||
|
||||
Write new documents using the template as a starting point, or validate existing documents as they are.
|
||||
|
||||
**6. Validate Documents**
|
||||
|
||||
Ensure documents conform to schema:
|
||||
|
||||
```bash
|
||||
markitect validate new-document.md doc-schema.json
|
||||
|
||||
markitect validate new-document.md doc-schema.json --detailed-errors
|
||||
```
|
||||
|
||||
**7. Iterate**
|
||||
|
||||
Fix validation errors and re-validate until document passes.
|
||||
|
||||
### Batch Validation Workflow
|
||||
|
||||
For managing multiple documents:
|
||||
|
||||
```bash
for doc in docs/*.md; do
  markitect validate "$doc" doc-schema.json --quiet || echo "Failed: $doc"
done
```
|
||||
|
||||
## VALIDATION RULES
|
||||
|
||||
### Heading Validation
|
||||
|
||||
Schemas validate heading structure through the **headings** property:
|
||||
|
||||
**level_1** headings must appear exactly once (document title)
|
||||
|
||||
**level_2** headings represent major sections (minItems/maxItems set bounds)
|
||||
|
||||
**level_3** headings provide subsections (often optional with minItems: 0)
|
||||
|
||||
Heading content can be validated with **pattern** or **enum** constraints for exact section names.
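
For example, the title pattern used by the manpage schema in this example directory can be tried out directly. The Python sketch below copies that pattern; the helper name `is_manpage_title` is hypothetical, and `re.search` is used because JSON Schema patterns are not implicitly anchored.

```python
import re

# Pattern copied from the level_1 constraint in markdown-manpage-schema.json
# (written there with JSON escaping as "^[a-z0-9-]+\\([0-9]\\) - .+").
TITLE_PATTERN = re.compile(r"^[a-z0-9-]+\([0-9]\) - .+")


def is_manpage_title(heading: str) -> bool:
    """True if the heading looks like 'command(section) - description'."""
    return TITLE_PATTERN.search(heading) is not None
```

The pattern accepts only lowercase command names and a single-digit section, so `Tool(1) - demo` and `tool(10) - demo` are both rejected.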
|
||||
|
||||
### Content Element Validation
|
||||
|
||||
**Paragraphs** - Validates document has sufficient descriptive content
|
||||
|
||||
**Code blocks** - Ensures technical documents include examples
|
||||
|
||||
**Lists** - Validates structured information presence
|
||||
|
||||
**Emphasis** - Checks for appropriate use of bold/italic formatting
|
||||
|
||||
Constraints use **minItems** and **maxItems** to set acceptable ranges.
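
A range check of this kind can be sketched in a few lines of Python. The function name `check_counts` and the shape of the `counts` argument (element name mapped to the number found) are assumptions made for illustration; the message wording follows the error examples in this manual.

```python
def check_counts(schema: dict, counts: dict) -> list[str]:
    """Compare element counts against minItems/maxItems in schema properties."""
    errors = []
    for name, rules in schema.get("properties", {}).items():
        found = counts.get(name, 0)
        minimum = rules.get("minItems")
        maximum = rules.get("maxItems")
        if minimum is not None and found < minimum:
            errors.append(f"Too few {name} (found {found}, minimum {minimum} required)")
        if maximum is not None and found > maximum:
            errors.append(f"Too many {name} (found {found}, maximum {maximum} allowed)")
    return errors
```

Properties without either bound, such as a nested **headings** object, are skipped by this sketch.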
|
||||
|
||||
### Metadata Validation
|
||||
|
||||
The **metadata** property validates overall document characteristics:
|
||||
|
||||
**total_elements** - Total AST node count
|
||||
|
||||
**structure_types** - Array of AST node types present
|
||||
|
||||
Use **const** for exact matches or ranges for flexibility.
|
||||
|
||||
## ERROR HANDLING
|
||||
|
||||
### Common Validation Errors
|
||||
|
||||
**Missing Required Section**
|
||||
|
||||
```
|
||||
Error: Required section 'SYNOPSIS' not found
|
||||
Suggestion: Add H2 heading '## SYNOPSIS' near document start
|
||||
```
|
||||
|
||||
**Insufficient Content**
|
||||
|
||||
```
|
||||
Error: Too few paragraphs (found 3, minimum 5 required)
|
||||
Suggestion: Add descriptive content to meet minimum paragraph count
|
||||
```
|
||||
|
||||
**Heading Count Mismatch**
|
||||
|
||||
```
|
||||
Error: Too many H2 headings (found 15, maximum 13 allowed)
|
||||
Suggestion: Combine related sections or adjust schema maxItems
|
||||
```
|
||||
|
||||
**Structure Type Mismatch**
|
||||
|
||||
```
|
||||
Error: Expected structure types not found: code_blocks
|
||||
Suggestion: Add code examples using fenced code blocks
|
||||
```
|
||||
|
||||
### Using Detailed Error Mode
|
||||
|
||||
Enable detailed errors for actionable feedback:
|
||||
|
||||
```bash
|
||||
markitect validate document.md schema.json --detailed-errors
|
||||
```
|
||||
|
||||
Output includes:
|
||||
- Specific constraint violations
|
||||
- Location information when available
|
||||
- Suggestions for fixes
|
||||
- Schema path to failing constraint
|
||||
|
||||
## SCHEMA DESIGN
|
||||
|
||||
### Best Practices
|
||||
|
||||
**Start with Real Documents**
|
||||
|
||||
Generate schemas from actual documents rather than writing from scratch. Real documents provide realistic constraints.
|
||||
|
||||
**Use Ranges, Not Exact Counts**
|
||||
|
||||
Allow flexibility with minItems/maxItems ranges:
|
||||
|
||||
```json
|
||||
"paragraphs": {
|
||||
"minItems": 10,
|
||||
"maxItems": 100
|
||||
}
|
||||
```
|
||||
|
||||
Avoid exact counts (**const**) unless structure is truly rigid.
|
||||
|
||||
**Required vs Optional Sections**
|
||||
|
||||
Use **x-markitect-required-sections** for essential sections like SYNOPSIS and DESCRIPTION.
|
||||
|
||||
Use **x-markitect-recommended-sections** for important but optional sections like EXAMPLES.
|
||||
|
||||
**Heading Patterns**
|
||||
|
||||
Use regex patterns for flexible heading validation:
|
||||
|
||||
```json
|
||||
"pattern": "^[A-Z][A-Z ]+$"
|
||||
```
|
||||
|
||||
Matches UPPERCASE section names while allowing variation.
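
To see what this pattern accepts, the Python sketch below filters a list of candidate section names. The helper name `uppercase_sections` is hypothetical.

```python
import re

# Same pattern as above: an uppercase letter followed by one or more
# uppercase letters or spaces.
SECTION_NAME = re.compile(r"^[A-Z][A-Z ]+$")


def uppercase_sections(names: list[str]) -> list[str]:
    """Keep only the names that match the UPPERCASE section pattern."""
    return [name for name in names if SECTION_NAME.search(name)]
```

Multi-word names such as `SEE ALSO` pass, while names containing digits or punctuation (for example `CI/CD`) and single-letter names do not, so widen the character class if such headings are expected.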
|
||||
|
||||
**Progressive Refinement**
|
||||
|
||||
Start with loose constraints, tighten based on validation experience with real documents.
|
||||
|
||||
### Anti-Patterns
|
||||
|
||||
**Over-Specification**
|
||||
|
||||
Avoid schemas that are too specific:
|
||||
|
||||
```json
"paragraphs": { "minItems": 47, "maxItems": 47 }
```
|
||||
|
||||
This requires exactly 47 paragraphs, which is too rigid for most use cases.
|
||||
|
||||
**Under-Specification**
|
||||
|
||||
Avoid schemas that validate nothing:
|
||||
|
||||
```json
|
||||
"paragraphs": { "minItems": 0 }
|
||||
```
|
||||
|
||||
Provide meaningful constraints that ensure document quality.
|
||||
|
||||
**Semantic Validation**
|
||||
|
||||
Schemas validate structure, not content. Don't expect schemas to validate:
|
||||
|
||||
- Correct grammar or spelling
|
||||
- Factual accuracy
|
||||
- Code correctness
|
||||
- Logical flow
|
||||
|
||||
Use other tools for semantic validation.
|
||||
|
||||
## INTEGRATION
|
||||
|
||||
### CI/CD Integration
|
||||
|
||||
Validate documentation in continuous integration:
|
||||
|
||||
```bash
markitect validate README.md readme-schema.json --quiet
exit_code=$?

if [ "$exit_code" -eq 0 ]; then
  echo "Documentation valid"
else
  echo "Documentation validation failed"
  markitect validate README.md readme-schema.json --detailed-errors
  exit 1
fi
```
|
||||
|
||||
### Git Hooks
|
||||
|
||||
Pre-commit hook for automatic validation:
|
||||
|
||||
```bash
# Escape the dot so only file names ending in ".md" match
changed_docs=$(git diff --cached --name-only --diff-filter=ACM | grep '\.md$')

for doc in $changed_docs; do
  schema="${doc%.md}-schema.json"
  if [ -f "$schema" ]; then
    markitect validate "$doc" "$schema" || exit 1
  fi
done
```
|
||||
|
||||
### Build Systems
|
||||
|
||||
Makefile integration:
|
||||
|
||||
```makefile
.PHONY: validate-docs
validate-docs:
	@for doc in docs/*.md; do \
		markitect validate "$$doc" doc-schema.json || exit 1; \
	done

.PHONY: build
build: validate-docs
	# Build process continues only if docs validate
```
|
||||
|
||||
## EXAMPLES
|
||||
|
||||
### Generate Schema from Document
|
||||
|
||||
```bash
|
||||
markitect schema-generate examples/invoice.md --output invoice-schema.json
|
||||
```
|
||||
|
||||
### Store Schema for Reuse
|
||||
|
||||
```bash
|
||||
markitect schema-ingest invoice-schema.json
|
||||
markitect schema-list
|
||||
```
|
||||
|
||||
### Validate Single Document
|
||||
|
||||
```bash
|
||||
markitect validate draft-invoice.md invoice-schema.json
|
||||
|
||||
markitect validate draft-invoice.md invoice-schema.json --detailed-errors
|
||||
```
|
||||
|
||||
### Batch Validation
|
||||
|
||||
```bash
for invoice in invoices/*.md; do
  if ! markitect validate "$invoice" invoice-schema.json --quiet; then
    echo "Invalid: $invoice"
    markitect validate "$invoice" invoice-schema.json --detailed-errors
  fi
done
```
|
||||
|
||||
### Template Generation
|
||||
|
||||
```bash
|
||||
markitect generate-stub invoice-schema.json --output new-invoice-template.md
|
||||
|
||||
cat new-invoice-template.md
|
||||
|
||||
markitect validate new-invoice-template.md invoice-schema.json
|
||||
```
|
||||
|
||||
### Schema Refinement Workflow
|
||||
|
||||
```bash
|
||||
markitect schema-generate example.md --output v1-schema.json
|
||||
|
||||
markitect validate test-doc.md v1-schema.json --detailed-errors
|
||||
|
||||
markitect schema-generate example.md --max-depth 2 --output v2-schema.json
|
||||
|
||||
markitect validate test-doc.md v2-schema.json
|
||||
```
|
||||
|
||||
## FILES
|
||||
|
||||
**\*.json**
|
||||
: JSON schema files defining document structure
|
||||
: Standard JSON Schema draft-07 format with MarkiTect extensions
|
||||
|
||||
**markitect.db**
|
||||
: Database storing ingested schemas
|
||||
: SQLite database in current directory or specified path
|
||||
|
||||
**.markitect.yml**
|
||||
: Configuration file for default schemas
|
||||
: YAML format with schema paths and validation rules
|
||||
|
||||
## EXIT STATUS
|
||||
|
||||
**0**
|
||||
: Success - document is valid
|
||||
|
||||
**1**
|
||||
: General error - file not found, invalid arguments
|
||||
|
||||
**2**
|
||||
: Configuration error - invalid schema file
|
||||
|
||||
**3**
|
||||
: Database error - schema storage/retrieval failed
|
||||
|
||||
**4**
|
||||
: Validation error - document does not conform to schema
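
Scripts that wrap the command may want to tell an invalid document apart from a tooling problem. The Python sketch below encodes the table above; the function name `classify_exit` and its category labels are hypothetical.

```python
# Exit codes and meanings as listed in this section.
EXIT_STATUS = {
    0: "Success - document is valid",
    1: "General error - file not found, invalid arguments",
    2: "Configuration error - invalid schema file",
    3: "Database error - schema storage/retrieval failed",
    4: "Validation error - document does not conform to schema",
}


def classify_exit(code: int) -> str:
    """Group an exit code into valid, invalid-document, tool-error, or unknown."""
    if code == 0:
        return "valid"
    if code == 4:
        return "invalid-document"
    if code in EXIT_STATUS:
        return "tool-error"
    return "unknown"
```

A CI job could, for instance, fail the build on `invalid-document` but retry or alert on `tool-error`.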
|
||||
|
||||
## ENVIRONMENT
|
||||
|
||||
**MARKITECT_DATABASE**
|
||||
: Path to database file for schema storage
|
||||
: Default: markitect.db in current directory
|
||||
|
||||
**MARKITECT_SCHEMA_PATH**
|
||||
: Search path for schema files
|
||||
: Colon-separated list of directories
|
||||
|
||||
**MARKITECT_VALIDATION_STRICT**
|
||||
: Enable strict validation mode
|
||||
: Any non-empty value enables strict mode
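
The Python sketch below shows one way a wrapper script might read these variables. It assumes POSIX-style colon separation as described above; the helper names are hypothetical, and the sketch makes no claim about how MarkiTect itself resolves them.

```python
import os
from pathlib import Path


def schema_search_dirs(environ=None) -> list[Path]:
    """Split MARKITECT_SCHEMA_PATH into directories, skipping empty entries."""
    environ = os.environ if environ is None else environ
    raw = environ.get("MARKITECT_SCHEMA_PATH", "")
    return [Path(part) for part in raw.split(":") if part]


def strict_mode(environ=None) -> bool:
    """Strict mode is on for any non-empty MARKITECT_VALIDATION_STRICT value."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("MARKITECT_VALIDATION_STRICT", ""))
```

Note that under the "any non-empty value" rule, setting `MARKITECT_VALIDATION_STRICT=0` still enables strict mode; unset the variable to disable it.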
|
||||
|
||||
## SEE ALSO
|
||||
|
||||
**markitect**(1), **json-schema**(7), **markdown-it**(7)
|
||||
|
||||
Related documentation:
|
||||
- JSON Schema Specification (https://json-schema.org/)
|
||||
- MarkiTect Schema Reference
|
||||
- AST Structure Documentation
|
||||
- Template System Guide
|
||||
|
||||
## LIMITATIONS
|
||||
|
||||
Schema validation has inherent limitations:
|
||||
|
||||
**Structure Only**
|
||||
|
||||
Schemas validate document structure, not content semantics. Cannot validate:
|
||||
- Factual correctness
|
||||
- Code functionality
|
||||
- Logical consistency
|
||||
- Language quality
|
||||
|
||||
**AST-Based**
|
||||
|
||||
Validation operates on parsed AST, not raw markdown. Some markdown formatting details may not be preserved or validated.
|
||||
|
||||
**Performance**
|
||||
|
||||
Validating large documents against complex schemas can be slow. AST caching mitigates this for repeated validations.
|
||||
|
||||
**Schema Complexity**
|
||||
|
||||
Very complex schemas can become difficult to maintain. Keep schemas as simple as possible while meeting requirements.
|
||||
|
||||
## BUGS
|
||||
|
||||
Report bugs at: https://github.com/markitect/markitect/issues
|
||||
|
||||
Known issues:
|
||||
- Schema generation from very large documents may be slow
|
||||
- Some edge cases in heading pattern matching
|
||||
- Limited support for custom markdown extensions
|
||||
|
||||
## AUTHORS
|
||||
|
||||
MarkiTect development team
|
||||
|
||||
Schema validation system designed for structured content management and documentation consistency.
|
||||
|
||||
## COPYRIGHT
|
||||
|
||||
Copyright (c) 2025 MarkiTect Project. Licensed under MIT License.
|
||||
|
||||
## VERSION
|
||||
|
||||
This manual documents schema validation in MarkiTect version 1.0 and later.
|
||||