# Schema Management Guide
Complete guide to managing schemas in MarkiTect using the Schema-of-Schemas system.
## Overview
MarkiTect provides a comprehensive schema management system with:
- Markdown-first schema format with embedded JSON
- Strict naming conventions for consistency
- Metaschema validation for all schemas
- Multi-schema batch validation
- Schema registry with version tracking
## Quick Start
### 1. Create a New Schema
Create a markdown file following the naming convention: `{domain}-schema-v{major}.{minor}.md`
```bash
# Example: blog-post-schema-v1.0.md
```
**Template:**
````markdown
---
schema-id: https://markitect.dev/schemas/blog-post/v1.0
version: 1.0.0
status: stable
domain: blog-post
description: Schema for blog post documents
---
# Blog Post Schema v1.0.0
## Overview
This schema validates blog post documents with frontmatter and content sections.
## Schema Definition
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://markitect.dev/schemas/blog-post/v1.0",
  "title": "Blog Post Schema",
  "description": "Schema for blog post documents",
  "version": "1.0.0",
  "type": "object",
  "properties": {
    "title": {
      "type": "string",
      "minLength": 1
    },
    "author": {
      "type": "string"
    },
    "date": {
      "type": "string",
      "format": "date"
    }
  },
  "required": ["title", "author"]
}
```
````
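To make the "markdown with embedded JSON" layout concrete, here is a minimal Python sketch that splits such a file into its frontmatter and its JSON Schema. It is an illustration only, not MarkiTect's loader: `split_schema_document` is a hypothetical name, and the frontmatter is assumed to contain only flat `key: value` lines, as in the template above.

```python
import json
import re

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3

# Leading frontmatter delimited by '---' lines (assumed to be flat key: value pairs).
FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
# First fenced block tagged 'json'.
JSON_BLOCK_RE = re.compile(FENCE + r"json\n(.*?)\n" + FENCE, re.DOTALL)


def split_schema_document(text: str) -> tuple[dict, dict]:
    """Return (frontmatter, json_schema) parsed from a markdown schema file."""
    fm_match = FRONTMATTER_RE.search(text)
    if fm_match is None:
        raise ValueError("no frontmatter found")
    frontmatter = {}
    for line in fm_match.group(1).splitlines():
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            frontmatter[key.strip()] = value.strip()
    block = JSON_BLOCK_RE.search(text)
    if block is None:
        raise ValueError("no json code block found")
    return frontmatter, json.loads(block.group(1))


sample = "\n".join([
    "---",
    "schema-id: https://markitect.dev/schemas/blog-post/v1.0",
    "version: 1.0.0",
    "---",
    "# Blog Post Schema v1.0.0",
    FENCE + "json",
    '{"title": "Blog Post Schema", "version": "1.0.0"}',
    FENCE,
])
frontmatter, schema = split_schema_document(sample)
print(frontmatter["schema-id"], schema["title"])
```

A real loader would use a YAML parser for the frontmatter; the line-based split here only keeps the sketch dependency-free.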
### 2. Validate Your Schema
Validate against the metaschema to ensure it follows MarkiTect conventions:
```bash
# Validate a single schema file
markitect schema-validate ./blog-post-schema-v1.0.md
# See detailed errors
markitect schema-validate ./blog-post-schema-v1.0.md --detailed-errors
```
### 3. Ingest into Registry
Add your schema to the registry:
```bash
markitect schema-ingest blog-post-schema-v1.0.md
```
### 4. List Registered Schemas
View all schemas with numbered references:
```bash
# Simple format (default)
markitect schema-list
# Table format
markitect schema-list --format table
# JSON format
markitect schema-list --format json
```
**Output:**
```
Found 4 schema(s):
[1] 🔧 blog-post-schema-v1.0.md (added: 2026-01-05T10:30:00)
[2] 🔧 schema-schema-v1.0.md (added: 2026-01-05T03:33:42)
[3] 🔧 manpage-schema-v1.0.md (added: 2026-01-05T03:33:42)
[4] 🔧 api-documentation-schema-v1.0.md (added: 2026-01-05T03:33:35)
```
## Schema Validation
### Single Schema Validation
**By number:**
```bash
markitect schema-validate 1
```
**By filename (from registry):**
```bash
markitect schema-validate blog-post-schema-v1.0.md
```
**By filesystem path:**
```bash
markitect schema-validate ./my-schema.md
```
### Batch Validation
**Validate a range:**
```bash
markitect schema-validate 1-3
```
**Validate specific schemas:**
```bash
markitect schema-validate 1,3,5
```
**Validate all schemas:**
```bash
markitect schema-validate --all
```
**Output:**
```
Validating 4 schema(s)...
Results:
  #  Schema                            Status    Details
---  --------------------------------  --------  ---------
  1  blog-post-schema-v1.0.md          ✅ Valid  v1.0.0
  2  schema-schema-v1.0.md             ✅ Valid  v1.0.0
  3  manpage-schema-v1.0.md            ✅ Valid  v1.0.0
  4  api-documentation-schema-v1.0.md  ✅ Valid  v1.0.0
Summary: 4 valid, 0 failed
```
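The numeric selectors above (`1`, `1-3`, `1,3,5`) can be read as a small grammar. The following Python sketch shows one way such a selector could be expanded into schema numbers. `parse_selector` is a hypothetical helper, not MarkiTect's implementation, and the error wording is borrowed from the Troubleshooting section of this guide.

```python
def parse_selector(selector: str, total: int) -> list[int]:
    """Expand a selector such as '2', '1-3' or '1,3,5' into schema numbers.

    `total` is the number of registered schemas; numbering starts at 1.
    """
    numbers: list[int] = []
    for part in selector.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > total or start > end:
            raise ValueError(
                f"Range {start}-{end} is out of bounds. Valid range: 1-{total}"
            )
        numbers.extend(range(start, end + 1))
    return numbers


print(parse_selector("1-3", total=4))
print(parse_selector("1,3,5", total=5))
```

Note that `--all` is a flag rather than a selector, so it is not handled here.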
## Document Validation (Semantic)
### Validate Documents Against Schemas
Beyond validating schema structure, MarkiTect can validate actual markdown documents against schemas, checking both structural (AST) and semantic (x-markitect extensions) aspects.
**Validate a document:**
```bash
# Full validation (structural + semantic)
markitect validate my-document.md --schema manpage-schema-v1.0.md
# Only structural validation (classic mode)
markitect validate my-document.md --schema schema.json --no-semantic
# With external link checking (may be slow)
markitect validate my-document.md --schema manpage-schema-v1.0.md --check-links
# Strict mode (warnings become errors)
markitect validate my-document.md --schema manpage-schema-v1.0.md --strict
```
### What is Validated
**Structural Validation** (always enabled):
- Document AST structure matches JSON Schema properties
- Heading counts, paragraph counts, code block counts
- Element types and nesting
**Semantic Validation** (controlled by `--semantic`, which is on by default; disable with `--no-semantic`):
- **Section Classifications**: Checks that required sections are present and improper sections are absent
  - REQUIRED sections must be present (ERROR if missing)
  - RECOMMENDED sections should be present (WARNING if missing)
  - IMPROPER sections must not be present (ERROR if found)
  - DISCOURAGED sections should not be present (WARNING if found)
  - OPTIONAL sections may or may not be present (no check)
- **Content Patterns**: Validates that content matches regex patterns
  - `required_patterns`: Content must match (ERROR if missing)
  - `forbidden_patterns`: Content must not match (ERROR if found)
  - `discouraged_patterns`: Content should not match (WARNING if found)
- **Quality Metrics**: Checks word counts and sentence counts
  - `min_words`, `max_words`: Word count requirements (WARNING)
  - `min_sentences`: Minimum sentence count (WARNING)
- **Link Validation**: Validates internal links and, optionally, external links
  - Internal links: Checked by default when semantic validation is enabled
    - Fragment links (`#section-name`) verified to exist (ERROR if broken)
    - Relative file paths checked for existence (ERROR if broken)
  - External links: Opt-in with the `--check-links` flag (may be slow)
    - HTTP/HTTPS URLs validated with HEAD requests (WARNING if broken)
  - Email validation: Validates `mailto:` link format (WARNING if invalid)
  - Fragment policy: Configurable allow/disallow fragment identifiers
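The section classification rules above amount to a small severity table. The sketch below encodes that table in Python; the function name and the shape of the `rules` dictionary are assumptions made for illustration and do not reflect the actual `x-markitect-sections` format.

```python
# Severity when a classified section is missing or present, per the list above.
MISSING_SEVERITY = {"required": "ERROR", "recommended": "WARNING"}
PRESENT_SEVERITY = {"improper": "ERROR", "discouraged": "WARNING"}


def check_sections(found: set[str], classifications: dict[str, str]) -> list[tuple[str, str, str]]:
    """Return (severity, section, reason) findings for a document's sections."""
    findings = []
    for section, kind in classifications.items():
        present = section in found
        if not present and kind in MISSING_SEVERITY:
            findings.append((MISSING_SEVERITY[kind], section, f"{kind} section is missing"))
        elif present and kind in PRESENT_SEVERITY:
            findings.append((PRESENT_SEVERITY[kind], section, f"{kind} section is present"))
        # 'optional' sections produce no finding either way.
    return findings


# Hypothetical classification for a man page; section names are illustrative.
rules = {
    "SYNOPSIS": "required",
    "EXAMPLES": "recommended",
    "CHANGELOG": "discouraged",
    "NOTES": "optional",
}
findings = check_sections({"DESCRIPTION", "CHANGELOG"}, rules)
for severity, section, reason in findings:
    print(f"{severity}: {section} - {reason}")
```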
### Validation Output
```
Validation result: VALID
File: my-command.1.md
Schema: schema file: manpage-schema-v1.0.md
✅ Document structure matches schema requirements
============================================================
Semantic Validation Results:
============================================================
Section Validation:
✅ SYNOPSIS - Present (required)
✅ DESCRIPTION - Present (required)
✅ EXAMPLES - Present (recommended)
Content Validation:
✅ All content requirements met
Link Validation:
✅ All 12 links valid
Summary:
Sections checked: 3
Sections found: 5
Errors: 0
Warnings: 0
Status: PASSED ✅
```
### Common Validation Scenarios
**Example 1: Missing Required Section**
```bash
$ markitect validate doc.md --schema manpage-schema-v1.0.md
❌ Document validation failed
Section Validation:
❌ SYNOPSIS - SYNOPSIS section is mandatory
✅ DESCRIPTION - Present (required)
Errors: 1
Status: FAILED ❌
```
**Example 2: Forbidden Pattern Found**
```bash
$ markitect validate doc.md --schema manpage-schema-v1.0.md
Content Validation:
❌ SYNOPSIS - Forbidden pattern found: 'TODO'
Errors: 1
Status: FAILED ❌
```
**Example 3: Content Too Short (Warning)**
```bash
$ markitect validate doc.md --schema manpage-schema-v1.0.md
Content Validation:
⚠️ DESCRIPTION - Content too short (25 words, minimum 50)
Warnings: 1
Status: PASSED ✅
# With --strict flag, this would fail:
$ markitect validate doc.md --schema manpage-schema-v1.0.md --strict
Status: FAILED ❌ (warnings treated as errors)
```
**Example 4: Broken Internal Link**
```bash
$ markitect validate doc.md --schema manpage-schema-v1.0.md
Link Validation:
❌ #nonexistent-section - Internal link target not found: #nonexistent-section
Errors: 1
Status: FAILED ❌
```
**Example 5: External Link Validation**
```bash
# Enable external link checking (may be slow)
$ markitect validate doc.md --schema manpage-schema-v1.0.md --check-links
Link Validation:
✅ http://example.com - Valid
⚠️ http://broken-link.invalid - External link unreachable: Name or service not known
Warnings: 1
Status: PASSED ✅
```
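Examples 2 and 3 hinge on two ideas: content rules produce errors or warnings, and `--strict` promotes warnings to failures. The Python sketch below models both under stated assumptions: `check_content` and `passed` are hypothetical names, the rule keys follow the names listed under "What is Validated", and words are counted by whitespace splitting, which may differ from MarkiTect's own counting.

```python
import re


def check_content(text: str, rules: dict) -> list[tuple[str, str]]:
    """Return (severity, message) findings for one section's text."""
    findings = []
    for pattern in rules.get("required_patterns", []):
        if not re.search(pattern, text):
            findings.append(("ERROR", f"Required pattern missing: {pattern!r}"))
    for pattern in rules.get("forbidden_patterns", []):
        if re.search(pattern, text):
            findings.append(("ERROR", f"Forbidden pattern found: {pattern!r}"))
    for pattern in rules.get("discouraged_patterns", []):
        if re.search(pattern, text):
            findings.append(("WARNING", f"Discouraged pattern found: {pattern!r}"))
    words = len(text.split())  # assumption: whitespace-separated tokens count as words
    if "min_words" in rules and words < rules["min_words"]:
        findings.append(("WARNING", f"Content too short ({words} words, minimum {rules['min_words']})"))
    if "max_words" in rules and words > rules["max_words"]:
        findings.append(("WARNING", f"Content too long ({words} words, maximum {rules['max_words']})"))
    return findings


def passed(findings: list[tuple[str, str]], strict: bool = False) -> bool:
    """Errors always fail; in strict mode warnings fail as well."""
    failing = {"ERROR", "WARNING"} if strict else {"ERROR"}
    return not any(severity in failing for severity, _ in findings)


rules = {"forbidden_patterns": [r"\bTODO\b"], "min_words": 5}
short = check_content("Prints the version.", rules)
todo = check_content("TODO: write this section properly soon", rules)
print(short, passed(short), passed(short, strict=True))
print(todo, passed(todo))
```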
## Schema Naming Conventions
All schema filenames must follow this pattern:
```
{domain}-schema-v{major}.{minor}.md
```
### Rules
- **Domain**: Lowercase letters, numbers, and hyphens only
- **Version**: Major.minor format (e.g., `v1.0`, `v2.3`)
- **Extension**: Must be `.md`
- **No spaces**: Use hyphens for separation
### Valid Examples
- `blog-post-schema-v1.0.md`
- `api-documentation-schema-v2.1.md`
- `user-profile-schema-v1.0.md`
### Invalid Examples
- `BlogPost-schema-v1.0.md` (uppercase)
- `blog_post-schema-v1.0.md` (underscore)
- `blog-post-v1.0.md` (missing "schema")
- `blog-post-schema-v1.md` (missing minor version)
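The naming rules translate directly into a regular expression. Here is a minimal Python sketch that checks a filename and extracts its parts; the pattern is derived from the rules stated above, and `parse_schema_filename` is a hypothetical helper rather than part of MarkiTect.

```python
import re
from typing import Optional

# {domain}-schema-v{major}.{minor}.md, with the domain limited to
# lowercase letters, digits, and hyphens.
SCHEMA_FILENAME_RE = re.compile(
    r"(?P<domain>[a-z0-9-]+)-schema-v(?P<major>\d+)\.(?P<minor>\d+)\.md"
)


def parse_schema_filename(name: str) -> Optional[tuple[str, int, int]]:
    """Return (domain, major, minor), or None if the name breaks the convention."""
    match = SCHEMA_FILENAME_RE.fullmatch(name)
    if match is None:
        return None
    return match["domain"], int(match["major"]), int(match["minor"])


for name in [
    "blog-post-schema-v1.0.md",
    "BlogPost-schema-v1.0.md",
    "blog_post-schema-v1.0.md",
    "blog-post-v1.0.md",
    "blog-post-schema-v1.md",
]:
    print(name, "->", parse_schema_filename(name))
```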
## Required Schema Fields
All schemas must include these fields:
### Frontmatter (YAML)
```yaml
---
schema-id: https://markitect.dev/schemas/{domain}/v{major}.{minor}
version: {major}.{minor}.{patch}
status: draft|stable|deprecated
domain: {domain}
description: Brief description
---
```
### JSON Schema
```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://markitect.dev/schemas/{domain}/v{major}.{minor}",
  "title": "Schema Title",
  "description": "Schema description",
  "version": "{major}.{minor}.{patch}"
}
```
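As a quick pre-flight check before running `markitect schema-validate`, the required fields above can be verified with a few lines of Python. This is a sketch only: `check_required_fields` is a hypothetical helper, it takes already-parsed dictionaries, and the consistency checks between frontmatter and JSON are an assumption based on the template, where both copies agree. The metaschema remains the authority on what is actually required.

```python
REQUIRED_FRONTMATTER = ("schema-id", "version", "status", "domain", "description")
REQUIRED_JSON = ("$schema", "$id", "title", "description", "version")
ALLOWED_STATUS = ("draft", "stable", "deprecated")


def check_required_fields(frontmatter: dict, schema: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"frontmatter missing '{key}'" for key in REQUIRED_FRONTMATTER if key not in frontmatter]
    problems += [f"JSON schema missing '{key}'" for key in REQUIRED_JSON if key not in schema]
    status = frontmatter.get("status")
    if status is not None and status not in ALLOWED_STATUS:
        problems.append(f"unknown status '{status}'")
    # Assumption: the id and version are meant to agree in both places, as in the template.
    if "schema-id" in frontmatter and "$id" in schema and frontmatter["schema-id"] != schema["$id"]:
        problems.append("schema-id and $id differ")
    if "version" in frontmatter and "version" in schema and frontmatter["version"] != schema["version"]:
        problems.append("frontmatter and JSON versions differ")
    return problems


good_frontmatter = {
    "schema-id": "https://markitect.dev/schemas/blog-post/v1.0",
    "version": "1.0.0",
    "status": "stable",
    "domain": "blog-post",
    "description": "Schema for blog post documents",
}
good_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "https://markitect.dev/schemas/blog-post/v1.0",
    "title": "Blog Post Schema",
    "description": "Schema for blog post documents",
    "version": "1.0.0",
}
print(check_required_fields(good_frontmatter, good_schema))
```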
## Common Workflows
### Revalidate All Schemas After Metaschema Changes
When you update the metaschema, revalidate all registered schemas:
```bash
markitect schema-validate --all
```
### Check Schema Rigidity
Analyze a schema for overly rigid constraints:
```bash
markitect schema-analyze my-schema.md
```
### Refine a Rigid Schema
Automatically loosen overly specific constraints:
```bash
# Dry run (preview changes)
markitect schema-refine my-schema.md --dry-run
# Apply changes
markitect schema-refine my-schema.md
# Interactive mode
markitect schema-refine my-schema.md --interactive
```
### Get Schema Details
View schema metadata:
```bash
markitect schema-get blog-post-schema-v1.0.md
```
### Delete a Schema
Remove a schema from the registry:
```bash
markitect schema-delete blog-post-schema-v1.0.md --confirm
```
## Resolution Precedence
When a schema is referenced by name, MarkiTect resolves it in this order:
1. **Registry (by filename)**: Exact match in the database
2. **Filesystem (fallback)**: Used when the name is not found in the registry, or when the argument looks like a path (contains `/`)
### Examples
```bash
# Looks up in registry first
markitect schema-validate blog-post-schema-v1.0.md
# Forces filesystem lookup (contains /)
markitect schema-validate ./blog-post-schema-v1.0.md
# Also forces filesystem
markitect schema-validate ../schemas/blog-post-schema-v1.0.md
```
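The precedence rule can be summarized in a few lines of Python. This is a sketch of the documented behavior, not MarkiTect's code: `resolve_schema` is a hypothetical function, the registry is modeled as a plain dictionary, and "looks like a path" is taken to mean "contains `/`", following the comments in the examples above.

```python
import tempfile
from pathlib import Path


def resolve_schema(reference: str, registry: dict[str, str], cwd: Path) -> tuple[str, str]:
    """Return (source, location) where source is 'registry' or 'filesystem'."""
    looks_like_path = "/" in reference
    # 1. Registry lookup by exact filename, skipped for path-like references.
    if not looks_like_path and reference in registry:
        return "registry", registry[reference]
    # 2. Filesystem fallback, relative to the working directory.
    candidate = cwd / reference
    if candidate.is_file():
        return "filesystem", str(candidate)
    raise LookupError(f"Schema '{reference}' not found in registry or filesystem")


registry = {"blog-post-schema-v1.0.md": "registry entry for blog-post"}
with tempfile.TemporaryDirectory() as tmp:
    cwd = Path(tmp)
    (cwd / "blog-post-schema-v1.0.md").write_text("# local copy\n")
    by_name = resolve_schema("blog-post-schema-v1.0.md", registry, cwd)
    by_path = resolve_schema("./blog-post-schema-v1.0.md", registry, cwd)

print(by_name[0])
print(by_path[0])
```

The same filename resolves differently depending on whether it is written as a bare name or as a path, which is why the `./` prefix is useful when testing a local edit of an already-registered schema.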
## Best Practices
### Schema Development
1. **Start with a template**: Use an existing schema as a starting point
2. **Validate early**: Validate against the metaschema before ingesting
3. **Use semantic versioning**: Major.minor.patch for all versions
4. **Document thoroughly**: Include overview, usage, and examples
5. **Test with real documents**: Validate actual documents against your schema
### Version Management
- **Increment major version**: Breaking changes to schema structure
- **Increment minor version**: Backward-compatible additions
- **Increment patch version**: Bug fixes and clarifications
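The versioning rules above interact with the filename convention: the filename carries only `major.minor`, while the `version` field carries `major.minor.patch`. The Python sketch below illustrates that relationship. Both helpers are hypothetical, and resetting lower components to zero on a bump is the usual semantic versioning convention rather than something this guide states.

```python
def bump_version(version: str, change: str) -> str:
    """Bump a 'major.minor.patch' version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":   # breaking change to schema structure
        return f"{major + 1}.0.0"
    if change == "minor":   # backward-compatible addition
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # bug fix or clarification
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


def filename_for(domain: str, version: str) -> str:
    """Build the schema filename, which omits the patch component."""
    major, minor, _patch = version.split(".")
    return f"{domain}-schema-v{major}.{minor}.md"


new_version = bump_version("1.0.0", "minor")
print(new_version, filename_for("blog-post", new_version))
```

Under this reading, a patch release keeps the same filename, while a minor or major release produces a new file.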
### Schema Organization
```
markitect/schemas/
├── schema-schema-v1.0.md # Metaschema
├── manpage-schema-v1.0.md # Man page documents
├── api-documentation-schema-v1.0.md
├── terminology-schema-v1.0.md
└── blog-post-schema-v1.0.md # Your schemas
```
## Troubleshooting
### Schema Not Found
```
❌ Schema 'my-schema.md' not found in registry or filesystem
```
**Solution:** Use `markitect schema-list` to see available schemas, or provide a path: `./my-schema.md`
### Validation Fails
```
❌ Schema validation failed: my-schema.md
Found 2 validation error(s):
```
**Solution:** Check error messages and compare with metaschema requirements. Use `--detailed-errors` for more context.
### Invalid Selector
```
❌ Invalid selector: Range 1-10 is out of bounds. Valid range: 1-4
```
**Solution:** Use `markitect schema-list` to see valid numbers, or check your range syntax.
## Advanced Usage
### Scripting with Schema Commands
Validate schemas in CI/CD:
```bash
#!/bin/bash
# Validate all schemas and exit with error if any fail
if ! markitect schema-validate --all; then
    echo "Schema validation failed!"
    exit 1
fi
echo "All schemas valid"
```
### Batch Operations
```bash
# Validate recently added schemas
markitect schema-validate 1-3
# Validate specific critical schemas
markitect schema-validate 1,5,8
# Check just the metaschema
markitect schema-validate 2
```
## Schema Extensions
MarkiTect supports custom extensions in schemas:
- `x-markitect-sections`: Section classification (required, recommended, optional, discouraged, improper)
- `x-markitect-content-control`: Content validation rules and patterns
- `x-markitect-metadata`: Additional metadata for MarkiTect processing
See existing schemas for examples of these extensions.
## Future Enhancements
Planned features:
- Wildcard/globbing support: `markitect schema-validate */manpage*`
- Schema diff tool: Compare schema versions
- Schema migration assistant: Help upgrade documents to new schema versions
## Related Documentation
- [Schema Naming Specification](../history/260105-schema-of-schemas/SCHEMA_NAMING_SPEC.md)
- [Schema Loader Guide](../history/260105-schema-of-schemas/SCHEMA_LOADER_GUIDE.md)
- [Metaschema Reference](../markitect/schemas/schema-schema-v1.0.md)
- [Implementation Workplan](../history/260105-schema-of-schemas/WORKPLAN.md) (archived)
## Support
For issues or questions:
- Check existing schemas as examples
- Review metaschema validation errors carefully
- Use `--detailed-errors` for more context
- Consult the metaschema for requirements