Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Added comprehensive OPTIONS section with 18 command-line options organized
into 4 categories:
1. Validation Options (5 options)
- --schema, --schema-json, --detailed-errors, --error-format, --quiet
2. Schema Generation Options (3 options)
- --output, --style, --title
3. Schema Management Options (4 options)
- --schema-list, --schema-info, --schema-delete, --confirm
4. Phase 2 Schema Refinement Options (6 options)
- --verbose, --dry-run, --interactive, --loosen-counts,
--round-numbers, --migrate-deprecated
This addresses the schema recommendation:
- Before: OPTIONS section missing (recommended but not present)
- After: OPTIONS section present with 424 words, 22 documented options
The manpage now fully complies with all schema recommendations:
✅ All required sections present (SYNOPSIS, DESCRIPTION)
✅ All recommended sections present (OPTIONS, EXAMPLES, SEE ALSO, COPYRIGHT)
✅ Document still validates successfully
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
902 lines
23 KiB
Markdown
902 lines
23 KiB
Markdown
# markdown-schema-validation(7) - Structured Document Validation with JSON Schema
|
|
|
|
## SYNOPSIS
|
|
|
|
**markitect schema-generate** *SOURCE_FILE* [**--output** *SCHEMA_FILE*]
|
|
|
|
**markitect schema-ingest** *SCHEMA_FILE*
|
|
|
|
**markitect validate** *DOCUMENT* *SCHEMA*
|
|
|
|
**markitect generate-stub** *SCHEMA* [**--output** *FILE*]
|
|
|
|
## DESCRIPTION
|
|
|
|
Markdown Schema Validation is MarkiTect's system for enforcing structural consistency in markdown documents. Unlike traditional markdown linters that check syntax, schema validation ensures documents conform to predefined structural patterns by validating their Abstract Syntax Tree (AST) representation against JSON Schema definitions.
|
|
|
|
This approach enables content management workflows where document structure is as important as content, making it ideal for technical documentation, business documents, and any scenario requiring consistent document templates.
|
|
|
|
### How Schema Validation Works
|
|
|
|
MarkiTect parses markdown files into an AST representation, then validates the AST structure against JSON schemas. The validation process checks:
|
|
|
|
- **Heading hierarchy** - Required heading levels and counts
|
|
- **Content elements** - Minimum and maximum paragraph counts
|
|
- **Structural patterns** - Presence of lists, code blocks, tables
|
|
- **Section organization** - Required and optional document sections
|
|
|
|
Schemas validate structure, not semantics. A document can pass validation while containing incorrect content, as long as the structure matches the schema.
|
|
|
|
## OPTIONS
|
|
|
|
### Validation Options
|
|
|
|
**--schema** *PATH*, **-s** *PATH*
|
|
: Path to JSON schema file for validation
|
|
: Used with **validate** command to specify schema location
|
|
|
|
**--schema-json** *TEXT*
|
|
: JSON schema provided as inline string
|
|
: Alternative to --schema for programmatic use
|
|
: Useful for testing or dynamic schema generation
|
|
|
|
**--detailed-errors**, **--errors**
|
|
: Show detailed validation errors with line numbers
|
|
: Provides specific locations and descriptions of failures
|
|
: Essential for debugging complex schema validation issues
|
|
|
|
**--error-format** *FORMAT*
|
|
: Format for error output: **text**, **json**, or **markdown**
|
|
: Default: **text**
|
|
: JSON format useful for CI/CD pipeline integration
|
|
: Markdown format for inclusion in documentation
|
|
|
|
**--quiet**, **-q**
|
|
: Only output validation result (true/false)
|
|
: Suppresses all other output for scripting
|
|
: Exit code indicates success (0) or failure (non-zero)
|
|
|
|
### Schema Generation Options
|
|
|
|
**--output** *PATH*, **-o** *PATH*
|
|
: Output file path for generated schema or document
|
|
: Used with **schema-generate** and **generate-stub** commands
|
|
: If omitted, outputs to stdout
|
|
|
|
**--style** *STYLE*
|
|
: Placeholder content style for **generate-stub** command
|
|
: Options: **default**, **custom**, **detailed**
|
|
: Affects the verbosity of generated stub content
|
|
|
|
**--title** *TEXT*
|
|
: Custom document title for generated stubs
|
|
: Overrides default title derived from schema
|
|
: Useful for creating multiple documents from one schema
|
|
|
|
### Schema Management Options
|
|
|
|
**--schema-list**
|
|
: List all available schemas in the library
|
|
: Shows schema names and descriptions
|
|
: Helps discover reusable schema patterns
|
|
|
|
**--schema-info** *SCHEMA_NAME*
|
|
: Display detailed information about a specific schema
|
|
: Shows schema structure, requirements, and metadata
|
|
: Useful for understanding schema capabilities before use
|
|
|
|
**--schema-delete** *SCHEMA_NAME*
|
|
: Remove a schema from the library
|
|
: Requires confirmation unless **--confirm** flag is used
|
|
: Irreversible operation - use with caution
|
|
|
|
**--confirm**
|
|
: Skip confirmation prompts for destructive operations
|
|
: Used with **schema-delete** and similar commands
|
|
: Useful for automation scripts
|
|
|
|
### Phase 2 Schema Refinement Options
|
|
|
|
**--verbose**, **-v**
|
|
: Show detailed analysis with current and suggested values
|
|
: Used with **schema-analyze** command
|
|
: Provides comprehensive rigidity assessment
|
|
|
|
**--dry-run**
|
|
: Preview refinement changes without applying them
|
|
: Used with **schema-refine** command
|
|
: Allows review before modifying schemas
|
|
|
|
**--interactive**, **-i**
|
|
: Prompt for each refinement interactively
|
|
: Used with **schema-refine** command
|
|
: Provides fine-grained control over applied fixes
|
|
|
|
**--loosen-counts**
|
|
: Convert exact counts to flexible ranges (default: enabled)
|
|
: Part of schema refinement process
|
|
: Can be disabled with **--no-loosen-counts**
|
|
|
|
**--round-numbers**
|
|
: Round overly specific numbers (default: enabled)
|
|
: Improves schema reusability
|
|
: Can be disabled with **--no-round-numbers**
|
|
|
|
**--migrate-deprecated**
|
|
: Document deprecated extension usage
|
|
: Helps identify schemas needing manual migration
|
|
: Does not automatically migrate (too risky)
|
|
|
|
## SCHEMA STRUCTURE
|
|
|
|
### JSON Schema Format
|
|
|
|
MarkiTect schemas are standard JSON Schema (draft-07) documents with custom extensions for markdown-specific validation.
|
|
|
|
#### Standard Properties
|
|
|
|
**properties.headings**
|
|
: Defines heading structure by level (level_1, level_2, level_3)
|
|
: Each level specifies minItems, maxItems, and content patterns
|
|
|
|
**properties.paragraphs**
|
|
: Array constraints for paragraph counts
|
|
: Validates document length and content density
|
|
|
|
**properties.code_blocks**
|
|
: Array constraints for code examples
|
|
: Ensures technical documentation includes examples
|
|
|
|
**properties.lists**
|
|
: Array constraints for list elements
|
|
: Validates presence of structured information
|
|
|
|
**properties.emphasis**
|
|
: Array constraints for bold and italic text
|
|
: Ensures appropriate use of emphasis
|
|
|
|
#### MarkiTect Extensions
|
|
|
|
MarkiTect extends JSON Schema with custom properties prefixed with **x-markitect-**:
|
|
|
|
**x-markitect-sections**
|
|
: Section classification and content control system
|
|
: Defines sections with five classification levels:
|
|
: - **required**: Must be present (validation fails if missing)
|
|
: - **recommended**: Should be present (warning if missing)
|
|
: - **optional**: May be present (no validation impact)
|
|
: - **discouraged**: Should not be present (warning if present)
|
|
: - **improper**: Must not be present (validation fails if present)
|
|
: Each section can specify content instructions, constraints, and custom messages
|
|
|
|
**x-markitect-content-control**
|
|
: Content validation rules for section content
|
|
: Defines required/discouraged/forbidden patterns
|
|
: Specifies content quality metrics (word count, readability)
|
|
: Provides content instructions for authors
|
|
|
|
**x-markitect-outline-mode**
|
|
: Boolean enabling outline-only validation
|
|
: Focuses on heading structure without content validation
|
|
|
|
**x-markitect-heading-text-capture**
|
|
: Boolean enabling exact heading text validation
|
|
: Enforces specific section names
|
|
|
|
## COMMANDS
|
|
|
|
### Schema Generation
|
|
|
|
**markitect schema-generate** *SOURCE_FILE*
|
|
: Analyzes markdown file AST and generates JSON schema
|
|
: Schema describes actual structure found in source document
|
|
|
|
**--output** *SCHEMA_FILE*
|
|
: Write schema to file instead of stdout
|
|
: Default: outputs to terminal
|
|
|
|
**--max-depth** *N*
|
|
: Limit heading analysis to depth N
|
|
: Useful for outline-focused schemas
|
|
|
|
### Schema Management
|
|
|
|
**markitect schema-ingest** *SCHEMA_FILE*
|
|
: Store schema in MarkiTect database
|
|
: Registers schema for reuse with validation commands
|
|
|
|
**markitect schema-list**
|
|
: Display all stored schemas
|
|
: Shows schema names and metadata
|
|
|
|
**markitect schema-get** *SCHEMA_NAME*
|
|
: Retrieve stored schema
|
|
: Outputs JSON schema to stdout
|
|
|
|
**markitect schema-delete** *SCHEMA_NAME*
|
|
: Remove schema from database
|
|
: Permanently deletes schema definition
|
|
|
|
### Document Validation
|
|
|
|
**markitect validate** *DOCUMENT* *SCHEMA*
|
|
: Validate markdown document against schema
|
|
: Returns exit code 0 for valid, 4 for invalid
|
|
|
|
**--detailed-errors**
|
|
: Show detailed validation error messages
|
|
: Includes suggestions for fixing violations
|
|
|
|
**--quiet**
|
|
: Suppress output, exit code only
|
|
: Useful for scripting and automation
|
|
|
|
### Template Generation
|
|
|
|
**markitect generate-stub** *SCHEMA*
|
|
: Generate markdown template from schema
|
|
: Creates document outline following schema structure
|
|
|
|
**--output** *FILE*
|
|
: Write template to file
|
|
: Default: outputs to stdout
|
|
|
|
## WORKFLOW
|
|
|
|
### Schema-Driven Development Workflow
|
|
|
|
The typical workflow for schema-based document management:
|
|
|
|
**1. Generate Schema from Example**
|
|
|
|
Create or identify an exemplar document with the desired structure, then generate its schema:
|
|
|
|
```bash
|
|
markitect schema-generate exemplar.md --output doc-schema.json
|
|
```
|
|
|
|
**2. Refine Schema**
|
|
|
|
Edit the generated schema to adjust constraints:
|
|
|
|
- Change minItems/maxItems for flexibility
|
|
- Add required-sections extensions
|
|
- Adjust heading patterns
|
|
- Add content instructions
|
|
|
|
**3. Store Schema**
|
|
|
|
Register schema for reuse:
|
|
|
|
```bash
|
|
markitect schema-ingest doc-schema.json
|
|
```
|
|
|
|
**4. Generate Templates**
|
|
|
|
Create document templates from schema:
|
|
|
|
```bash
|
|
markitect generate-stub doc-schema.json --output template.md
|
|
```
|
|
|
|
**5. Create Documents**
|
|
|
|
Write new documents using template as starting point, or use existing documents.
|
|
|
|
**6. Validate Documents**
|
|
|
|
Ensure documents conform to schema:
|
|
|
|
```bash
|
|
markitect validate new-document.md doc-schema.json
|
|
|
|
markitect validate new-document.md doc-schema.json --detailed-errors
|
|
```
|
|
|
|
**7. Iterate**
|
|
|
|
Fix validation errors and re-validate until document passes.
|
|
|
|
### Batch Validation Workflow
|
|
|
|
For managing multiple documents:
|
|
|
|
```bash
|
|
for doc in docs/*.md; do
|
|
markitect validate "$doc" doc-schema.json --quiet || echo "Failed: $doc"
|
|
done
|
|
```
|
|
|
|
## VALIDATION RULES
|
|
|
|
### Heading Validation
|
|
|
|
Schemas validate heading structure through the **headings** property:
|
|
|
|
**level_1** headings must appear exactly once (document title)
|
|
|
|
**level_2** headings represent major sections (minItems/maxItems set bounds)
|
|
|
|
**level_3** headings provide subsections (often optional with minItems: 0)
|
|
|
|
Heading content can be validated with **pattern** or **enum** constraints for exact section names.
|
|
|
|
### Content Element Validation
|
|
|
|
**Paragraphs** - Validates document has sufficient descriptive content
|
|
|
|
**Code blocks** - Ensures technical documents include examples
|
|
|
|
**Lists** - Validates structured information presence
|
|
|
|
**Emphasis** - Checks for appropriate use of bold/italic formatting
|
|
|
|
Constraints use **minItems** and **maxItems** to set acceptable ranges.
|
|
|
|
### Metadata Validation
|
|
|
|
The **metadata** property validates overall document characteristics:
|
|
|
|
**total_elements** - Total AST node count
|
|
|
|
**structure_types** - Array of AST node types present
|
|
|
|
Use **const** for exact matches or ranges for flexibility.
|
|
|
|
### Section Classification System
|
|
|
|
MarkiTect provides a five-level classification system for document sections through **x-markitect-sections**:
|
|
|
|
#### Required Sections
|
|
|
|
Sections marked as **required** must be present in the document. Validation fails with an error if missing.
|
|
|
|
```json
|
|
"SYNOPSIS": {
|
|
"classification": "required",
|
|
"error_message": "SYNOPSIS section is mandatory for all manpages"
|
|
}
|
|
```
|
|
|
|
**Validation Behavior**:
|
|
- Missing → ERROR → validation fails
|
|
- Present → Continue validation
|
|
|
|
#### Recommended Sections
|
|
|
|
Sections marked as **recommended** should be present. A warning is generated if missing, but validation succeeds.
|
|
|
|
```json
|
|
"EXAMPLES": {
|
|
"classification": "recommended",
|
|
"warning_if_missing": "Examples improve documentation usability"
|
|
}
|
|
```
|
|
|
|
**Validation Behavior**:
|
|
- Missing → WARNING → validation succeeds with warnings
|
|
- Present → Continue validation
|
|
|
|
#### Optional Sections
|
|
|
|
Sections marked as **optional** may or may not be present with no validation impact.
|
|
|
|
```json
|
|
"BUGS": {
|
|
"classification": "optional",
|
|
"content_instruction": "Known issues and bug reporting"
|
|
}
|
|
```
|
|
|
|
**Validation Behavior**:
|
|
- Missing → No impact
|
|
- Present → Continue validation
|
|
|
|
#### Discouraged Sections
|
|
|
|
Sections marked as **discouraged** should not be present. A warning is generated if found, but validation succeeds.
|
|
|
|
```json
|
|
"DEPRECATED": {
|
|
"classification": "discouraged",
|
|
"warning_if_missing": "Move deprecated content to HISTORY section"
|
|
}
|
|
```
|
|
|
|
**Validation Behavior**:
|
|
- Missing → No impact
|
|
- Present → WARNING → validation succeeds with warnings
|
|
|
|
#### Improper Sections
|
|
|
|
Sections marked as **improper** must not be present. Validation fails with an error if found.
|
|
|
|
```json
|
|
"TODO": {
|
|
"classification": "improper",
|
|
"error_message": "TODO sections must be removed before publication"
|
|
}
|
|
```
|
|
|
|
**Validation Behavior**:
|
|
- Missing → No impact
|
|
- Present → ERROR → validation fails
|
|
|
|
### Content Control
|
|
|
|
The **x-markitect-content-control** extension enables content-level validation:
|
|
|
|
#### Pattern Validation
|
|
|
|
**required_patterns** - Array of regex patterns that must appear in content:
|
|
```json
|
|
"required_patterns": ["\\*\\*command\\*\\*", "\\[.*\\]"]
|
|
```
|
|
|
|
**discouraged_patterns** - Patterns that should not appear (generates warnings):
|
|
```json
|
|
"discouraged_patterns": ["TODO", "FIXME", "\\bWIP\\b"]
|
|
```
|
|
|
|
**forbidden_patterns** - Patterns that must not appear (validation fails):
|
|
```json
|
|
"forbidden_patterns": ["password\\s*=", "api[_-]?key\\s*="]
|
|
```
|
|
|
|
#### Content Quality Metrics
|
|
|
|
Validate content length and readability:
|
|
|
|
```json
|
|
"content_quality": {
|
|
"min_words": 50,
|
|
"max_words": 1000,
|
|
"readability_target": "technical",
|
|
"min_sentences": 3
|
|
}
|
|
```
|
|
|
|
**Readability Targets**:
|
|
- **simple** - Elementary school level
|
|
- **general** - General audience
|
|
- **technical** - Technical audience (default for documentation)
|
|
- **advanced** - Expert/academic level
|
|
|
|
#### Content Instructions
|
|
|
|
Provide guidance for content authors:
|
|
|
|
```json
|
|
"content_instructions": [
|
|
"Show command name in bold",
|
|
"Use brackets [] for optional arguments",
|
|
"Keep synopsis concise (1-5 lines)"
|
|
]
|
|
```
|
|
|
|
These instructions appear in validation reports and generated templates.
|
|
|
|
## ERROR HANDLING
|
|
|
|
### Common Validation Errors
|
|
|
|
**Missing Required Section**
|
|
|
|
```
|
|
Error: Required section 'SYNOPSIS' not found
|
|
Suggestion: Add H2 heading '## SYNOPSIS' near document start
|
|
```
|
|
|
|
**Insufficient Content**
|
|
|
|
```
|
|
Error: Too few paragraphs (found 3, minimum 5 required)
|
|
Suggestion: Add descriptive content to meet minimum paragraph count
|
|
```
|
|
|
|
**Heading Count Mismatch**
|
|
|
|
```
|
|
Error: Too many H2 headings (found 15, maximum 13 allowed)
|
|
Suggestion: Combine related sections or adjust schema maxItems
|
|
```
|
|
|
|
**Structure Type Mismatch**
|
|
|
|
```
|
|
Error: Expected structure types not found: code_blocks
|
|
Suggestion: Add code examples using fenced code blocks
|
|
```
|
|
|
|
### Using Detailed Error Mode
|
|
|
|
Enable detailed errors for actionable feedback:
|
|
|
|
```bash
|
|
markitect validate document.md schema.json --detailed-errors
|
|
```
|
|
|
|
Output includes:
|
|
- Specific constraint violations
|
|
- Location information when available
|
|
- Suggestions for fixes
|
|
- Schema path to failing constraint
|
|
|
|
## SCHEMA DESIGN
|
|
|
|
### Best Practices
|
|
|
|
**Start with Real Documents**
|
|
|
|
Generate schemas from actual documents rather than writing from scratch. Real documents provide realistic constraints.
|
|
|
|
**Use Ranges, Not Exact Counts**
|
|
|
|
Allow flexibility with minItems/maxItems ranges:
|
|
|
|
```json
|
|
"paragraphs": {
|
|
"minItems": 10,
|
|
"maxItems": 100
|
|
}
|
|
```
|
|
|
|
Avoid exact counts (**const**) unless structure is truly rigid.
|
|
|
|
**Section Classification**
|
|
|
|
Use the five-level classification system to define section requirements:
|
|
|
|
```json
|
|
"x-markitect-sections": {
|
|
"SYNOPSIS": {
|
|
"classification": "required",
|
|
"content_instruction": "Brief command syntax",
|
|
"error_message": "SYNOPSIS is mandatory"
|
|
},
|
|
"EXAMPLES": {
|
|
"classification": "recommended",
|
|
"warning_if_missing": "Examples improve usability"
|
|
},
|
|
"BUGS": {
|
|
"classification": "optional"
|
|
}
|
|
}
|
|
```
|
|
|
|
Choose classifications based on importance:
|
|
- **required** for essential sections (SYNOPSIS, DESCRIPTION)
|
|
- **recommended** for important sections (EXAMPLES, SEE ALSO)
|
|
- **optional** for nice-to-have sections (BUGS, AUTHORS)
|
|
- **discouraged** for sections that should be elsewhere (DEPRECATED)
|
|
- **improper** for sections that must not appear (TODO, INTERNAL_NOTES)
|
|
|
|
**Heading Patterns**
|
|
|
|
Use regex patterns for flexible heading validation:
|
|
|
|
```json
|
|
"pattern": "^[A-Z][A-Z ]+$"
|
|
```
|
|
|
|
Matches UPPERCASE section names while allowing variation.
|
|
|
|
**Progressive Refinement**
|
|
|
|
Start with loose constraints, tighten based on validation experience with real documents.
|
|
|
|
### Anti-Patterns
|
|
|
|
**Over-Specification**
|
|
|
|
Avoid schemas that are too specific:
|
|
|
|
```json
|
|
"paragraphs": { "const": 47 }
|
|
```
|
|
|
|
This requires exactly 47 paragraphs, which is too rigid for most use cases.
|
|
|
|
**Under-Specification**
|
|
|
|
Avoid schemas that validate nothing:
|
|
|
|
```json
|
|
"paragraphs": { "minItems": 0 }
|
|
```
|
|
|
|
Provide meaningful constraints that ensure document quality.
|
|
|
|
**Semantic Validation**
|
|
|
|
Schemas validate structure, not content. Don't expect schemas to validate:
|
|
|
|
- Correct grammar or spelling
|
|
- Factual accuracy
|
|
- Code correctness
|
|
- Logical flow
|
|
|
|
Use other tools for semantic validation.
|
|
|
|
## INTEGRATION
|
|
|
|
### CI/CD Integration
|
|
|
|
Validate documentation in continuous integration:
|
|
|
|
```bash
|
|
markitect validate README.md readme-schema.json --quiet
|
|
exit_code=$?
|
|
|
|
if [ $exit_code -eq 0 ]; then
|
|
echo "Documentation valid"
|
|
else
|
|
echo "Documentation validation failed"
|
|
markitect validate README.md readme-schema.json --detailed-errors
|
|
exit 1
|
|
fi
|
|
```
|
|
|
|
### Git Hooks
|
|
|
|
Pre-commit hook for automatic validation:
|
|
|
|
```bash
|
|
changed_docs=$(git diff --cached --name-only --diff-filter=ACM | grep '.md$')
|
|
|
|
for doc in $changed_docs; do
|
|
schema="${doc%.md}-schema.json"
|
|
if [ -f "$schema" ]; then
|
|
markitect validate "$doc" "$schema" || exit 1
|
|
fi
|
|
done
|
|
```
|
|
|
|
### Build Systems
|
|
|
|
Makefile integration:
|
|
|
|
```makefile
|
|
.PHONY: validate-docs
|
|
validate-docs:
|
|
@for doc in docs/*.md; do \
|
|
markitect validate "$$doc" doc-schema.json || exit 1; \
|
|
done
|
|
|
|
.PHONY: build
|
|
build: validate-docs
|
|
# Build process continues only if docs validate
|
|
```
|
|
|
|
## EXAMPLES
|
|
|
|
### Generate Schema from Document
|
|
|
|
```bash
|
|
markitect schema-generate examples/invoice.md --output invoice-schema.json
|
|
```
|
|
|
|
### Store Schema for Reuse
|
|
|
|
```bash
|
|
markitect schema-ingest invoice-schema.json
|
|
markitect schema-list
|
|
```
|
|
|
|
### Validate Single Document
|
|
|
|
```bash
|
|
markitect validate draft-invoice.md invoice-schema.json
|
|
|
|
markitect validate draft-invoice.md invoice-schema.json --detailed-errors
|
|
```
|
|
|
|
### Batch Validation
|
|
|
|
```bash
|
|
for invoice in invoices/*.md; do
|
|
markitect validate "$invoice" invoice-schema.json --quiet
|
|
if [ $? -ne 0 ]; then
|
|
echo "Invalid: $invoice"
|
|
markitect validate "$invoice" invoice-schema.json --detailed-errors
|
|
fi
|
|
done
|
|
```
|
|
|
|
### Template Generation
|
|
|
|
```bash
|
|
markitect generate-stub invoice-schema.json --output new-invoice-template.md
|
|
|
|
cat new-invoice-template.md
|
|
|
|
markitect validate new-invoice-template.md invoice-schema.json
|
|
```
|
|
|
|
### Schema Refinement Workflow
|
|
|
|
```bash
|
|
markitect schema-generate example.md --output v1-schema.json
|
|
|
|
markitect validate test-doc.md v1-schema.json --detailed-errors
|
|
|
|
markitect schema-generate example.md --max-depth 2 --output v2-schema.json
|
|
|
|
markitect validate test-doc.md v2-schema.json
|
|
```
|
|
|
|
### Schema with Classification System
|
|
|
|
Create a schema with section classifications and content control:
|
|
|
|
```json
|
|
{
|
|
"$schema": "http://json-schema.org/draft-07/schema#",
|
|
"title": "Technical Documentation Schema",
|
|
"x-markitect-sections": {
|
|
"OVERVIEW": {
|
|
"classification": "required",
|
|
"heading_level": 2,
|
|
"content_instruction": "High-level description of the system",
|
|
"min_paragraphs": 2,
|
|
"error_message": "OVERVIEW section is required"
|
|
},
|
|
"EXAMPLES": {
|
|
"classification": "recommended",
|
|
"heading_level": 2,
|
|
"min_code_blocks": 2,
|
|
"warning_if_missing": "Examples help users understand usage"
|
|
},
|
|
"REFERENCES": {
|
|
"classification": "optional",
|
|
"heading_level": 2,
|
|
"content_instruction": "External documentation and resources"
|
|
},
|
|
"TODO": {
|
|
"classification": "improper",
|
|
"error_message": "Remove TODO sections before publishing"
|
|
}
|
|
},
|
|
"x-markitect-content-control": {
|
|
"overview": {
|
|
"discouraged_patterns": ["TODO", "FIXME"],
|
|
"forbidden_patterns": ["password", "secret"],
|
|
"content_quality": {
|
|
"min_words": 100,
|
|
"max_words": 500,
|
|
"readability_target": "technical"
|
|
}
|
|
}
|
|
},
|
|
"properties": {
|
|
"headings": {
|
|
"properties": {
|
|
"level_1": {"minItems": 1, "maxItems": 1},
|
|
"level_2": {"minItems": 2, "maxItems": 20}
|
|
}
|
|
},
|
|
"paragraphs": {"minItems": 10, "maxItems": 200},
|
|
"code_blocks": {"minItems": 1}
|
|
}
|
|
}
|
|
```
|
|
|
|
Validate documents against this schema:
|
|
|
|
```bash
|
|
# Missing required section = ERROR
|
|
markitect validate doc-without-overview.md tech-schema.json
|
|
# Result: INVALID - missing required section OVERVIEW
|
|
|
|
# Missing recommended section = WARNING
|
|
markitect validate doc-without-examples.md tech-schema.json
|
|
# Result: VALID (with warnings) - missing recommended section EXAMPLES
|
|
|
|
# Improper section present = ERROR
|
|
markitect validate doc-with-todo.md tech-schema.json
|
|
# Result: INVALID - improper section TODO must not be present
|
|
```
|
|
|
|
## FILES
|
|
|
|
**\*.json**
|
|
: JSON schema files defining document structure
|
|
: Standard JSON Schema draft-07 format with MarkiTect extensions
|
|
|
|
**markitect.db**
|
|
: Database storing ingested schemas
|
|
: SQLite database in current directory or specified path
|
|
|
|
**.markitect.yml**
|
|
: Configuration file for default schemas
|
|
: YAML format with schema paths and validation rules
|
|
|
|
## EXIT STATUS
|
|
|
|
**0**
|
|
: Success - document is valid
|
|
|
|
**1**
|
|
: General error - file not found, invalid arguments
|
|
|
|
**2**
|
|
: Configuration error - invalid schema file
|
|
|
|
**3**
|
|
: Database error - schema storage/retrieval failed
|
|
|
|
**4**
|
|
: Validation error - document does not conform to schema
|
|
|
|
## ENVIRONMENT
|
|
|
|
**MARKITECT_DATABASE**
|
|
: Path to database file for schema storage
|
|
: Default: markitect.db in current directory
|
|
|
|
**MARKITECT_SCHEMA_PATH**
|
|
: Search path for schema files
|
|
: Colon-separated list of directories
|
|
|
|
**MARKITECT_VALIDATION_STRICT**
|
|
: Enable strict validation mode
|
|
: Any non-empty value enables strict mode
|
|
|
|
## SEE ALSO
|
|
|
|
**markitect**(1), **json-schema**(7), **markdown-it**(7)
|
|
|
|
Related documentation:
|
|
- JSON Schema Specification (https://json-schema.org/)
|
|
- MarkiTect Schema Reference
|
|
- AST Structure Documentation
|
|
- Template System Guide
|
|
|
|
## LIMITATIONS
|
|
|
|
Schema validation has inherent limitations:
|
|
|
|
**Structure Only**
|
|
|
|
Schemas validate document structure, not content semantics. Cannot validate:
|
|
- Factual correctness
|
|
- Code functionality
|
|
- Logical consistency
|
|
- Language quality
|
|
|
|
**AST-Based**
|
|
|
|
Validation operates on parsed AST, not raw markdown. Some markdown formatting details may not be preserved or validated.
|
|
|
|
**Performance**
|
|
|
|
Large documents with complex schemas may have performance implications. AST caching mitigates this for repeated validations.
|
|
|
|
**Schema Complexity**
|
|
|
|
Very complex schemas can become difficult to maintain. Keep schemas as simple as possible while meeting requirements.
|
|
|
|
## BUGS
|
|
|
|
Report bugs at: https://github.com/markitect/markitect/issues
|
|
|
|
Known issues:
|
|
- Schema generation from very large documents may be slow
|
|
- Some edge cases in heading pattern matching
|
|
- Limited support for custom markdown extensions
|
|
|
|
## AUTHORS
|
|
|
|
MarkiTect development team
|
|
|
|
Schema validation system designed for structured content management and documentation consistency.
|
|
|
|
## COPYRIGHT
|
|
|
|
Copyright (c) 2025 MarkiTect Project. Licensed under MIT License.
|
|
|
|
## VERSION
|
|
|
|
This manual documents schema validation in MarkiTect version 1.0 and later.
|