markitect-main/docs/SCHEMA_MANAGEMENT_GUIDE.md

Schema Management Guide

Complete guide to managing schemas in MarkiTect using the Schema-of-Schemas system.

Overview

MarkiTect's schema management system provides:

  • Markdown-first schema format with embedded JSON
  • Strict naming conventions for consistency
  • Metaschema validation for all schemas
  • Multi-schema batch validation
  • Schema registry with version tracking

Quick Start

1. Create a New Schema

Create a markdown file following the naming convention: {domain}-schema-v{major}.{minor}.md

# Example: blog-post-schema-v1.0.md

Template:

---
schema-id: https://markitect.dev/schemas/blog-post/v1.0
version: 1.0.0
status: stable
domain: blog-post
description: Schema for blog post documents
---

# Blog Post Schema v1.0.0

## Overview
This schema validates blog post documents with frontmatter and content sections.

## Schema Definition

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://markitect.dev/schemas/blog-post/v1.0",
  "title": "Blog Post Schema",
  "description": "Schema for blog post documents",
  "version": "1.0.0",
  "type": "object",
  "properties": {
    "title": {
      "type": "string",
      "minLength": 1
    },
    "author": {
      "type": "string"
    },
    "date": {
      "type": "string",
      "format": "date"
    }
  },
  "required": ["title", "author"]
}

```
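
To make the embedded-JSON layout concrete, here is a minimal Python sketch that pulls the first fenced json block out of a schema file's text and parses it. The helper name `extract_schema_json` is hypothetical, and this is an illustration of the file format rather than MarkiTect's own loader.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence line

# First fenced block tagged "json": opening fence, lazy body, closing fence.
_JSON_BLOCK = re.compile(
    rf"^{FENCE}json[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_schema_json(markdown_text: str) -> dict:
    """Parse the first fenced json block of a schema file (hypothetical helper)."""
    match = _JSON_BLOCK.search(markdown_text)
    if match is None:
        raise ValueError("no fenced json block found")
    # json.loads tolerates the blank line that may precede the closing fence.
    return json.loads(match.group(1))
```

Frontmatter parsing is left out on purpose; a YAML library would be needed for that part.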

2. Validate Your Schema

Validate the schema against the metaschema to confirm that it follows MarkiTect conventions:

# Validate a single schema file
markitect schema-validate ./blog-post-schema-v1.0.md

# See detailed errors
markitect schema-validate ./blog-post-schema-v1.0.md --detailed-errors

3. Ingest into Registry

Add your schema to the registry:

markitect schema-ingest blog-post-schema-v1.0.md

4. List Registered Schemas

View all schemas with numbered references:

# Simple format (default)
markitect schema-list

# Table format
markitect schema-list --format table

# JSON format
markitect schema-list --format json

Output:

Found 4 schema(s):

[1] 🔧 blog-post-schema-v1.0.md         (added: 2026-01-05T10:30:00)
[2] 🔧 schema-schema-v1.0.md            (added: 2026-01-05T03:33:42)
[3] 🔧 manpage-schema-v1.0.md           (added: 2026-01-05T03:33:42)
[4] 🔧 api-documentation-schema-v1.0.md (added: 2026-01-05T03:33:35)

Schema Validation

Single Schema Validation

By number:

markitect schema-validate 1

By filename (from registry):

markitect schema-validate blog-post-schema-v1.0.md

By filesystem path:

markitect schema-validate ./my-schema.md

Batch Validation

Validate a range:

markitect schema-validate 1-3

Validate specific schemas:

markitect schema-validate 1,3,5

Validate all schemas:

markitect schema-validate --all

Output:

Validating 4 schema(s)...

Results:

  #  Schema                            Status    Details
---  --------------------------------  --------  ---------
  1  blog-post-schema-v1.0.md          ✅ Valid   v1.0.0
  2  schema-schema-v1.0.md             ✅ Valid   v1.0.0
  3  manpage-schema-v1.0.md            ✅ Valid   v1.0.0
  4  api-documentation-schema-v1.0.md  ✅ Valid   v1.0.0

Summary: 4 valid, 0 failed
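
The selector forms above (a single number, a range such as 1-3, and a comma list such as 1,3,5) can be expanded with a few lines of Python. This is a sketch under assumptions: the function name `parse_selector` is hypothetical, mixing ranges and lists in one selector is my own extension, and the error text only imitates the message shown under Troubleshooting.

```python
def parse_selector(selector: str, total: int) -> list[int]:
    """Expand '2', '1-3', or '1,3,5' into sorted, unique 1-based indices.

    `total` is the number of registered schemas, as reported by schema-list.
    """
    indices: set[int] = set()
    for part in selector.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Reject anything outside 1..total, and reversed ranges.
        if start < 1 or end > total or start > end:
            raise ValueError(
                f"Range {start}-{end} is out of bounds. Valid range: 1-{total}"
            )
        indices.update(range(start, end + 1))
    return sorted(indices)
```

With four registered schemas, `parse_selector("1-3", 4)` yields the first three entries, while `parse_selector("1-10", 4)` raises a `ValueError`.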

Document Validation (Semantic)

Validate Documents Against Schemas

In addition to validating schemas themselves, MarkiTect validates Markdown documents against a schema. It checks both structure (the document AST) and semantics (the x-markitect extensions).

Validate a document:

# Full validation (structural + semantic)
markitect validate my-document.md --schema manpage-schema-v1.0.md

# Only structural validation (classic mode)
markitect validate my-document.md --schema schema.json --no-semantic

# With external link checking (may be slow)
markitect validate my-document.md --schema manpage-schema-v1.0.md --check-links

# Strict mode (warnings become errors)
markitect validate my-document.md --schema manpage-schema-v1.0.md --strict

What is Validated

Structural Validation (always enabled):

  • Document AST structure matches JSON Schema properties
  • Heading counts, paragraph counts, code block counts
  • Element types and nesting

Semantic Validation (enabled by default; disable with --no-semantic):

  • Section Classifications: Checks that required sections are present and improper sections are absent
    • REQUIRED sections must be present (ERROR if missing)
    • RECOMMENDED sections should be present (WARNING if missing)
    • IMPROPER sections must not be present (ERROR if found)
    • DISCOURAGED sections should not be present (WARNING if found)
    • OPTIONAL sections may or may not be present (no check)
  • Content Patterns: Validates content matches regex patterns
    • required_patterns: Content must match (ERROR if missing)
    • forbidden_patterns: Content must not match (ERROR if found)
    • discouraged_patterns: Content should not match (WARNING if found)
  • Quality Metrics: Checks word counts, sentence counts
    • min_words, max_words: Word count requirements (WARNING)
    • min_sentences: Minimum sentence count (WARNING)
  • Link Validation: Validates internal links by default and external links on request
    • Internal links: Checked by default when semantic validation enabled
      • Fragment links (#section-name) verified to exist (ERROR if broken)
      • Relative file paths checked for existence (ERROR if broken)
    • External links: Opt-in with --check-links flag (may be slow)
      • HTTP/HTTPS URLs validated with HEAD requests (WARNING if broken)
    • Email validation: Validates mailto: link format (WARNING if invalid)
    • Fragment policy: Configurable allow/disallow fragment identifiers
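
The section classification rules listed above reduce to a small severity table. The Python sketch below illustrates that mapping only; the names `Finding` and `check_sections` are hypothetical and the message wording is mine, not MarkiTect's.

```python
from dataclasses import dataclass

# Severity raised when a section is missing, keyed by classification.
MISSING_SEVERITY = {"required": "ERROR", "recommended": "WARNING"}
# Severity raised when a section is present, keyed by classification.
PRESENT_SEVERITY = {"improper": "ERROR", "discouraged": "WARNING"}


@dataclass(frozen=True)
class Finding:
    section: str
    severity: str  # "ERROR" or "WARNING"
    message: str


def check_sections(classifications: dict[str, str], present: set[str]) -> list[Finding]:
    """Compare classified sections with the sections found in a document."""
    findings = []
    for section, kind in classifications.items():
        if section not in present and kind in MISSING_SEVERITY:
            findings.append(
                Finding(section, MISSING_SEVERITY[kind], f"{kind} section is missing")
            )
        elif section in present and kind in PRESENT_SEVERITY:
            findings.append(
                Finding(section, PRESENT_SEVERITY[kind], f"{kind} section is present")
            )
        # "optional" sections never produce a finding.
    return findings
```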

Validation Output

Validation result: VALID
File: my-command.1.md
Schema: schema file: manpage-schema-v1.0.md
✅ Document structure matches schema requirements

============================================================
Semantic Validation Results:
============================================================
Section Validation:
  ✅ SYNOPSIS - Present (required)
  ✅ DESCRIPTION - Present (required)
  ✅ EXAMPLES - Present (recommended)

Content Validation:
  ✅ All content requirements met

Link Validation:
  ✅ All 12 links valid

Summary:
  Sections checked: 3
  Sections found: 5
  Errors: 0
  Warnings: 0
  Status: PASSED ✅

Common Validation Scenarios

Example 1: Missing Required Section

$ markitect validate doc.md --schema manpage-schema-v1.0.md
❌ Document validation failed

Section Validation:
  ❌ SYNOPSIS - SYNOPSIS section is mandatory
  ✅ DESCRIPTION - Present (required)

Errors: 1
Status: FAILED ❌

Example 2: Forbidden Pattern Found

$ markitect validate doc.md --schema manpage-schema-v1.0.md

Content Validation:
  ❌ SYNOPSIS - Forbidden pattern found: 'TODO'

Errors: 1
Status: FAILED ❌
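
A content pattern check of this kind can be sketched in Python as follows. The function name `check_patterns` is hypothetical, and the use of `re.search` (match anywhere in the section) is an assumption about how patterns are applied.

```python
import re


def check_patterns(content, required=(), forbidden=(), discouraged=()):
    """Return (severity, message) pairs for one section's content."""
    findings = []
    for pattern in required:
        if not re.search(pattern, content):
            findings.append(("ERROR", f"Required pattern missing: {pattern!r}"))
    for pattern in forbidden:
        if re.search(pattern, content):
            findings.append(("ERROR", f"Forbidden pattern found: {pattern!r}"))
    for pattern in discouraged:
        if re.search(pattern, content):
            findings.append(("WARNING", f"Discouraged pattern found: {pattern!r}"))
    return findings
```

For example, `check_patterns("TODO: write synopsis", forbidden=["TODO"])` reports one ERROR, which corresponds to the failure in Example 2.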

Example 3: Content Too Short (Warning)

$ markitect validate doc.md --schema manpage-schema-v1.0.md

Content Validation:
  ⚠️  DESCRIPTION - Content too short (25 words, minimum 50)

Warnings: 1
Status: PASSED ✅

# With --strict flag, this would fail:
$ markitect validate doc.md --schema manpage-schema-v1.0.md --strict
Status: FAILED ❌  (warnings treated as errors)
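
The interaction between warnings and --strict shown in Example 3 can be expressed compactly. In this Python sketch, `check_word_count` and `final_status` are hypothetical names, words are counted by whitespace splitting (an assumption), and the "too long" message is invented for symmetry.

```python
def check_word_count(content, min_words=None, max_words=None):
    """Return warning messages for word-count limits (whitespace-separated words)."""
    count = len(content.split())
    warnings = []
    if min_words is not None and count < min_words:
        warnings.append(f"Content too short ({count} words, minimum {min_words})")
    if max_words is not None and count > max_words:
        warnings.append(f"Content too long ({count} words, maximum {max_words})")
    return warnings


def final_status(errors: int, warnings: int, strict: bool = False) -> str:
    """Errors always fail; warnings fail only in strict mode."""
    failed = errors > 0 or (strict and warnings > 0)
    return "FAILED" if failed else "PASSED"
```

One warning and no errors gives `PASSED` by default and `FAILED` with `strict=True`.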

Example 4: Broken Internal Link

$ markitect validate doc.md --schema manpage-schema-v1.0.md

Link Validation:
  ❌ #nonexistent-section - Internal link target not found: #nonexistent-section

Errors: 1
Status: FAILED ❌
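
A fragment link check like the one in Example 4 needs two pieces: the set of anchors a document defines and the set of fragments it links to. The Python sketch below assumes GitHub-style heading anchors (lowercase, punctuation dropped, spaces turned into hyphens); MarkiTect's actual anchor rules are not documented here, and all names are hypothetical. It also does not skip headings inside fenced code blocks.

```python
import re

HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)
FRAGMENT_LINK = re.compile(r"\]\(#([^)\s]+)\)")


def slugify(heading: str) -> str:
    """Assumed anchor rule: lowercase, drop punctuation, spaces to hyphens."""
    cleaned = re.sub(r"[^\w\s-]", "", heading.lower())
    return re.sub(r"\s+", "-", cleaned.strip())


def broken_fragments(markdown_text: str) -> list[str]:
    """Return fragment links whose target heading does not exist."""
    anchors = {slugify(h) for h in HEADING.findall(markdown_text)}
    return [
        f"#{target}"
        for target in FRAGMENT_LINK.findall(markdown_text)
        if target not in anchors
    ]
```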

Example 5: External Link Validation

# Enable external link checking (may be slow)
$ markitect validate doc.md --schema manpage-schema-v1.0.md --check-links

Link Validation:
  ✅ http://example.com - Valid
  ⚠️  http://broken-link.invalid - External link unreachable: Name or service not known

Warnings: 1
Status: PASSED ✅

Schema Naming Conventions

All schema filenames must follow this pattern:

{domain}-schema-v{major}.{minor}.md

Rules

  • Domain: Lowercase letters, numbers, and hyphens only
  • Version: Major.minor format (e.g., v1.0, v2.3)
  • Extension: Must be .md
  • No spaces: Use hyphens for separation

Valid Examples

  • blog-post-schema-v1.0.md
  • api-documentation-schema-v2.1.md
  • user-profile-schema-v1.0.md

Invalid Examples

  • BlogPost-schema-v1.0.md (uppercase)
  • blog_post-schema-v1.0.md (underscore)
  • blog-post-v1.0.md (missing "schema")
  • blog-post-schema-v1.md (missing minor version)
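
The naming rules can be captured in one regular expression. This Python sketch is slightly stricter than the rules as stated, since it also rejects leading, trailing, and doubled hyphens in the domain; that strictness and the name `parse_schema_filename` are my assumptions.

```python
import re

SCHEMA_FILENAME = re.compile(
    r"(?P<domain>[a-z0-9]+(?:-[a-z0-9]+)*)"  # lowercase words joined by hyphens
    r"-schema-v(?P<major>\d+)\.(?P<minor>\d+)\.md"
)


def parse_schema_filename(name: str):
    """Return (domain, major, minor), or None if the name breaks the convention."""
    match = SCHEMA_FILENAME.fullmatch(name)
    if match is None:
        return None
    return match["domain"], int(match["major"]), int(match["minor"])
```

Applied to the lists above, each valid example parses and each invalid example returns `None`.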

Required Schema Fields

All schemas must include these fields:

Frontmatter (YAML)

---
schema-id: https://markitect.dev/schemas/{domain}/v{major}.{minor}
version: {major}.{minor}.{patch}
status: draft|stable|deprecated
domain: {domain}
description: Brief description
---

JSON Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://markitect.dev/schemas/{domain}/v{major}.{minor}",
  "title": "Schema Title",
  "description": "Schema description",
  "version": "{major}.{minor}.{patch}"
}
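
A pre-flight check for these required fields might look like the Python sketch below. It takes the frontmatter and the JSON schema as already-parsed dictionaries, so no YAML library is needed. The function name `check_required_fields` and the message wording are hypothetical; the metaschema remains the authority on what is valid.

```python
import re

REQUIRED_FRONTMATTER = ("schema-id", "version", "status", "domain", "description")
REQUIRED_JSON = ("$schema", "$id", "title", "description", "version")
ALLOWED_STATUS = {"draft", "stable", "deprecated"}
SEMVER = re.compile(r"\d+\.\d+\.\d+")  # major.minor.patch


def check_required_fields(frontmatter: dict, schema: dict) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = [
        f"frontmatter missing '{key}'"
        for key in REQUIRED_FRONTMATTER
        if key not in frontmatter
    ]
    problems += [
        f"JSON schema missing '{key}'" for key in REQUIRED_JSON if key not in schema
    ]
    status = frontmatter.get("status")
    if status is not None and status not in ALLOWED_STATUS:
        problems.append(f"invalid status '{status}'")
    for label, source in (("frontmatter", frontmatter), ("JSON schema", schema)):
        version = source.get("version")
        if version is not None and not SEMVER.fullmatch(str(version)):
            problems.append(f"{label} version '{version}' is not major.minor.patch")
    return problems
```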

Common Workflows

Revalidate All Schemas After Metaschema Changes

When you update the metaschema, revalidate all registered schemas:

markitect schema-validate --all

Check Schema Rigidity

Analyze a schema for overly rigid constraints:

markitect schema-analyze my-schema.md

Refine a Rigid Schema

Automatically loosen overly specific constraints:

# Dry run (preview changes)
markitect schema-refine my-schema.md --dry-run

# Apply changes
markitect schema-refine my-schema.md

# Interactive mode
markitect schema-refine my-schema.md --interactive

Get Schema Details

View schema metadata:

markitect schema-get blog-post-schema-v1.0.md

Delete a Schema

Remove a schema from the registry:

markitect schema-delete blog-post-schema-v1.0.md --confirm

Resolution Precedence

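When resolving a schema argument given by name, MarkiTect uses this order:

  1. Registry (by filename): Exact match in the database; tried first for bare filenames
  2. Filesystem: Used as a fallback when the name is not in the registry, and used directly when the argument looks like a path (contains /)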

Examples

# Looks up in registry first
markitect schema-validate blog-post-schema-v1.0.md

# Forces filesystem lookup (contains /)
markitect schema-validate ./blog-post-schema-v1.0.md

# Also forces filesystem
markitect schema-validate ../schemas/blog-post-schema-v1.0.md
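
The precedence can be summarized as a short decision function. This Python sketch models the registry as a plain dictionary and treats any argument containing / as a path; `resolve_schema`, its parameters, and the return shape are hypothetical, not MarkiTect internals.

```python
from pathlib import Path


def looks_like_path(ref: str) -> bool:
    """A reference containing a slash bypasses the registry."""
    return "/" in ref


def resolve_schema(ref: str, registry: dict[str, str], base_dir: Path) -> tuple[str, str]:
    """Return (source, location), where source is 'registry' or 'filesystem'."""
    # 1. Registry lookup, for bare filenames only.
    if not looks_like_path(ref) and ref in registry:
        return "registry", registry[ref]
    # 2. Filesystem: a fallback for bare names, the direct route for paths.
    candidate = base_dir / ref
    if candidate.is_file():
        return "filesystem", str(candidate)
    raise FileNotFoundError(f"Schema '{ref}' not found in registry or filesystem")
```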

Best Practices

Schema Development

  1. Start with a template: Use an existing schema as a starting point
  2. Validate early: Validate against the metaschema before ingesting
  3. Use semantic versioning: Major.minor.patch in the version field (filenames carry only major.minor)
  4. Document thoroughly: Include overview, usage, and examples
  5. Test with real documents: Validate actual documents against your schema

Version Management

  • Increment major version: Breaking changes to schema structure
  • Increment minor version: Backward-compatible additions
  • Increment patch version: Bug fixes and clarifications
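
As a worked illustration of these rules, the Python sketch below bumps a major.minor.patch string by change type. Resetting the lower components to zero follows common semantic-versioning practice and is an assumption here; the function name and the change labels are hypothetical.

```python
def bump_version(version: str, change: str) -> str:
    """Bump a 'major.minor.patch' string for a 'breaking', 'addition', or 'fix' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "addition":
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

A major or minor bump also changes the schema filename, since the filename encodes major.minor; a patch bump does not.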

Schema Organization

markitect/schemas/
├── schema-schema-v1.0.md        # Metaschema
├── manpage-schema-v1.0.md       # Man page documents
├── api-documentation-schema-v1.0.md
├── terminology-schema-v1.0.md
└── blog-post-schema-v1.0.md     # Your schemas

Troubleshooting

Schema Not Found

❌ Schema 'my-schema.md' not found in registry or filesystem

Solution: Use markitect schema-list to see available schemas, or provide a path: ./my-schema.md

Validation Fails

❌ Schema validation failed: my-schema.md
Found 2 validation error(s):

Solution: Check error messages and compare with metaschema requirements. Use --detailed-errors for more context.

Invalid Selector

❌ Invalid selector: Range 1-10 is out of bounds. Valid range: 1-4

Solution: Use markitect schema-list to see valid numbers, or check your range syntax.

Advanced Usage

Scripting with Schema Commands

Validate schemas in CI/CD:

#!/bin/bash
# Validate all schemas and exit with error if any fail
if ! markitect schema-validate --all; then
    echo "Schema validation failed!"
    exit 1
fi
echo "All schemas valid"

Batch Operations

# Validate recently added schemas
markitect schema-validate 1-3

# Validate specific critical schemas
markitect schema-validate 1,5,8

# Check just the metaschema
markitect schema-validate 2

Schema Extensions

MarkiTect supports custom extensions in schemas:

  • x-markitect-sections: Section classification (required, recommended, optional, discouraged, improper)
  • x-markitect-content-control: Content validation rules and patterns
  • x-markitect-metadata: Additional metadata for MarkiTect processing

See existing schemas for examples of these extensions.

Future Enhancements

Planned features:

  • Wildcard/globbing support: markitect schema-validate */manpage*
  • Schema diff tool: Compare schema versions
  • Schema migration assistant: Help upgrade documents to new schema versions

Support

For issues or questions:

  • Check existing schemas as examples
  • Review metaschema validation errors carefully
  • Use --detailed-errors for more context
  • Consult the metaschema for requirements