diff --git a/docs/SCHEMA_MANAGEMENT_GUIDE.md b/docs/SCHEMA_MANAGEMENT_GUIDE.md
index c8ce99f6..b6a06821 100644
--- a/docs/SCHEMA_MANAGEMENT_GUIDE.md
+++ b/docs/SCHEMA_MANAGEMENT_GUIDE.md
@@ -162,6 +162,115 @@ Results:
 Summary: 4 valid, 0 failed
 ```
 
+## Document Validation (Semantic)
+
+### Validate Documents Against Schemas
+
+Beyond validating schema structure, MarkiTect can validate actual Markdown documents against schemas, checking both the structural (AST) and semantic (`x-markitect` extensions) aspects.
+
+**Validate a document:**
+
+```bash
+# Full validation (structural + semantic)
+markitect validate my-document.md --schema manpage-schema-v1.0.md
+
+# Only structural validation (classic mode)
+markitect validate my-document.md --schema schema.json --no-semantic
+
+# Strict mode (warnings become errors)
+markitect validate my-document.md --schema manpage-schema-v1.0.md --strict
+```
+
+### What Is Validated
+
+**Structural Validation** (always enabled):
+- Document AST structure matches JSON Schema properties
+- Heading counts, paragraph counts, code block counts
+- Element types and nesting
+
+**Semantic Validation** (enabled by default via `--semantic`; disable with `--no-semantic`):
+- **Section Classifications**: Checks that required sections are present and improper sections are absent
+  - REQUIRED sections must be present (ERROR if missing)
+  - RECOMMENDED sections should be present (WARNING if missing)
+  - IMPROPER sections must not be present (ERROR if found)
+  - DISCOURAGED sections should not be present (WARNING if found)
+  - OPTIONAL sections may or may not be present (no check)
+- **Content Patterns**: Validates that content matches regex patterns
+  - `required_patterns`: Content must match (ERROR if missing)
+  - `forbidden_patterns`: Content must not match (ERROR if found)
+  - `discouraged_patterns`: Content should not match (WARNING if found)
+- **Quality Metrics**: Checks word and sentence counts
+  - `min_words`, `max_words`: Word count bounds (WARNING if outside the range)
+  - `min_sentences`: Minimum sentence count (WARNING if not met)
+
+### Validation Output
+
+```
+Validation result: VALID
+File: my-command.1.md
+Schema: schema file: manpage-schema-v1.0.md
+✅ Document structure matches schema requirements
+
+============================================================
+Semantic Validation Results:
+============================================================
+Section Validation:
+  ✅ SYNOPSIS - Present (required)
+  ✅ DESCRIPTION - Present (required)
+  ✅ EXAMPLES - Present (recommended)
+
+Content Validation:
+  ✅ All content requirements met
+
+Summary:
+  Sections checked: 3
+  Sections found: 5
+  Errors: 0
+  Warnings: 0
+  Status: PASSED ✅
+```
+
+### Common Validation Scenarios
+
+**Example 1: Missing Required Section**
+```bash
+$ markitect validate doc.md --schema manpage-schema-v1.0.md
+❌ Document validation failed
+
+Section Validation:
+  ❌ SYNOPSIS - SYNOPSIS section is mandatory
+  ✅ DESCRIPTION - Present (required)
+
+Errors: 1
+Status: FAILED ❌
+```
+
+**Example 2: Forbidden Pattern Found**
+```bash
+$ markitect validate doc.md --schema manpage-schema-v1.0.md
+
+Content Validation:
+  ❌ SYNOPSIS - Forbidden pattern found: 'TODO'
+
+Errors: 1
+Status: FAILED ❌
+```
+
+**Example 3: Content Too Short (Warning)**
+```bash
+$ markitect validate doc.md --schema manpage-schema-v1.0.md
+
+Content Validation:
+  ⚠️ DESCRIPTION - Content too short (25 words, minimum 50)
+
+Warnings: 1
+Status: PASSED ✅
+
+# With the --strict flag, the same warning causes a failure:
+$ markitect validate doc.md --schema manpage-schema-v1.0.md --strict
+Status: FAILED ❌ (warnings treated as errors)
+```
+
 ## Schema Naming Conventions
 
 All schema filenames must follow this pattern: