Schema Refinement Tools - User Guide
Overview
MarkiTect Phase 2 introduces schema refinement tools that analyze and improve JSON schemas used for Markdown validation. The tools detect rigidity issues and can automatically apply fixes that make schemas more flexible and reusable.
Quick Start
# Analyze a schema for rigidity issues
markitect schema-analyze examples/manpages/markdown-manpage-schema.json
# Refine a schema automatically
markitect schema-refine examples/manpages/markdown-manpage-schema.json --output refined-schema.json
# Review each fix interactively
markitect schema-refine examples/manpages/markdown-manpage-schema.json --interactive
Commands
schema-analyze
Analyzes a JSON schema to detect rigidity issues and calculate a rigidity score (0-100).
Usage
markitect schema-analyze <schema-file> [OPTIONS]
Options
- --verbose, -v: Show detailed analysis with current and suggested values
Examples
# Basic analysis
markitect schema-analyze schema.json
# Verbose output with details
markitect schema-analyze schema.json --verbose
Output
The analyzer provides:
- Rigidity Score (0-100): Higher scores indicate more rigid schemas
  - 0-40: LOW - Flexible, good design
  - 41-70: MEDIUM - Some rigidity detected
  - 71-100: HIGH - Very rigid, needs refinement
- Phase 1 Features: Checks for classification system and content control
- Issue Count: Breakdown by severity (Errors, Warnings, Info)
- Detected Issues: List of problems with suggestions
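As a rough illustration of the score bands above, the following sketch maps a score to its level name. The function `rigidity_level` is hypothetical and not part of MarkiTect; only the band boundaries are taken from this guide.

```python
def rigidity_level(score: int) -> str:
    """Map a 0-100 rigidity score to the level names listed above.

    Hypothetical helper for illustration; not a MarkiTect API.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score <= 40:
        return "LOW"
    if score <= 70:
        return "MEDIUM"
    return "HIGH"


print(rigidity_level(35))  # LOW
print(rigidity_level(55))  # MEDIUM
```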
Exit Codes
- 0: Schema is flexible (score ≤ 50)
- 1: Schema is rigid (score > 50)
- 2: Error occurred
schema-refine
Automatically refines rigid schemas by applying fixes for detected issues.
Usage
markitect schema-refine <schema-file> [OPTIONS]
Options
- --output, -o PATH: Output file (default: overwrite input file)
- --loosen-counts: Convert exact counts to flexible ranges (default: enabled)
- --no-loosen-counts: Disable count loosening
- --round-numbers: Round overly specific numbers (default: enabled)
- --no-round-numbers: Disable number rounding
- --migrate-deprecated: Document deprecated extensions (default: disabled)
- --dry-run: Show changes without applying them
- --interactive, -i: Prompt for each refinement interactively
Examples
# Refine schema in place
markitect schema-refine schema.json
# Preview changes without applying
markitect schema-refine schema.json --dry-run
# Save refined schema to new file
markitect schema-refine schema.json --output refined-schema.json
# Review each fix interactively
markitect schema-refine schema.json --interactive
# Disable specific refinements
markitect schema-refine schema.json --no-loosen-counts
Refinement Actions
The refiner automatically applies these fixes:
- Exact Count Loosening: Converts exact counts to flexible ranges
  - Before: "minItems": 5, "maxItems": 5
  - After: "minItems": 3, "maxItems": 10
- Const Value Conversion: Replaces exact value constraints with ranges
  - Before: "const": 1
  - After: "minimum": 0, "maximum": 2
- Number Rounding: Rounds overly specific numbers
  - Before: "minItems": 73
  - After: "minItems": 70
- Range Widening: Expands narrow integer ranges
  - Before: "minimum": 5, "maximum": 6
  - After: "minimum": 0, "maximum": 11
Exit Codes
- 0: Success with changes applied
- 1: Success but no changes needed
- 2: Error occurred
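Because exit code 1 signals success here, a script that treats every non-zero code as a failure will misreport an already-flexible schema. The sketch below shows one way to interpret the three documented codes. `RefineOutcome`, `interpret_refine_exit`, and `run_refine` are hypothetical names, and treating undocumented codes as errors is an assumption.

```python
import subprocess
from enum import Enum


class RefineOutcome(Enum):
    CHANGED = 0    # exit code 0: refinements were applied
    UNCHANGED = 1  # exit code 1: nothing needed refining
    ERROR = 2      # exit code 2: the command failed


def interpret_refine_exit(returncode: int) -> RefineOutcome:
    """Translate a schema-refine exit code into a named outcome."""
    try:
        return RefineOutcome(returncode)
    except ValueError:
        # Codes other than 0, 1, 2 are not documented; assume they are errors.
        return RefineOutcome.ERROR


def run_refine(schema_path: str, output_path: str) -> RefineOutcome:
    """Run schema-refine (requires markitect on PATH) and report the outcome."""
    result = subprocess.run(
        ["markitect", "schema-refine", schema_path, "--output", output_path],
        capture_output=True,
        text=True,
    )
    return interpret_refine_exit(result.returncode)
```

A caller can then branch on `RefineOutcome.UNCHANGED` separately from `RefineOutcome.ERROR` instead of comparing the return code to zero.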
Issue Types
Exact Count (WARNING)
Problem: Schema requires an exact number of items, leaving no flexibility.
Example:
{
"type": "array",
"minItems": 5,
"maxItems": 5
}
Fix: Convert to a range
{
"type": "array",
"minItems": 3,
"maxItems": 10
}
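To make the exact-count condition concrete, here is a minimal sketch that walks a schema and reports every node where minItems equals maxItems. It is not MarkiTect's detection code: `find_exact_counts` is a hypothetical helper, and the dotted path format is only modeled on the paths shown in the tool's output.

```python
def find_exact_counts(schema, path=""):
    """Return (path, count) pairs for nodes whose minItems equals maxItems."""
    found = []
    if isinstance(schema, dict):
        lo, hi = schema.get("minItems"), schema.get("maxItems")
        if lo is not None and lo == hi:
            found.append((path or "<root>", lo))
        # Recurse into every nested value, extending the dotted path
        for key, value in schema.items():
            child = f"{path}.{key}" if path else key
            found.extend(find_exact_counts(value, child))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            found.extend(find_exact_counts(item, f"{path}[{index}]"))
    return found


schema = {
    "properties": {
        "sections": {"type": "array", "minItems": 5, "maxItems": 5},
        "links": {"type": "array", "minItems": 1, "maxItems": 10},
    }
}
print(find_exact_counts(schema))  # [('properties.sections', 5)]
```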
Const Value (WARNING)
Problem: Property must have an exact value.
Example:
{
"type": "integer",
"const": 1
}
Fix: Replace with range for numeric values
{
"type": "integer",
"minimum": 0,
"maximum": 2
}
Overly Specific Numbers (INFO)
Problem: Numbers are too specific (like 73 instead of 70).
Example:
{
"type": "array",
"minItems": 73
}
Fix: Round to nearest 10
{
"type": "array",
"minItems": 70
}
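The rounding rule can be sketched in a few lines. This is an illustration only: `round_to_nearest_ten` is a hypothetical name, and rounding ties such as 75 upward is an assumption, since the guide does not say how the tool handles them.

```python
def round_to_nearest_ten(value: int) -> int:
    """Round a non-negative integer to the nearest multiple of 10.

    Assumption: ties (values ending in 5) round up.
    """
    if value < 0:
        raise ValueError("expected a non-negative count")
    return (value + 5) // 10 * 10


print(round_to_nearest_ten(73))  # 70
print(round_to_nearest_ten(78))  # 80
```

Note that under this rule a small value such as 4 rounds to 0, which removes the constraint entirely; whether the refiner rounds small values this way is not documented here.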
No Flexibility (INFO)
Problem: Integer range is too narrow.
Example:
{
"type": "integer",
"minimum": 5,
"maximum": 6
}
Fix: Widen the range
{
"type": "integer",
"minimum": 0,
"maximum": 11
}
Missing Classifications (INFO)
Problem: Schema doesn't use the Phase 1 classification system.
Suggestion: Add x-markitect-sections to classify sections as required/recommended/optional/discouraged/improper.
Missing Content Control (INFO)
Problem: Schema lacks content validation patterns and quality metrics.
Suggestion: Add x-markitect-content-control for pattern validation and quality requirements.
Deprecated Extensions (WARNING)
Problem: Schema uses old extension format.
Example: x-markitect-required-sections
Suggestion: Migrate to x-markitect-sections with classification system.
Workflows
Basic Workflow: Analyze and Refine
- Analyze your schema to understand issues:
  markitect schema-analyze my-schema.json --verbose
- Preview refinements before applying:
  markitect schema-refine my-schema.json --dry-run
- Apply refinements:
  markitect schema-refine my-schema.json --output my-schema-refined.json
- Verify improvements:
  markitect schema-analyze my-schema-refined.json
Interactive Workflow
For fine-grained control, use interactive mode:
markitect schema-refine my-schema.json --interactive
The tool will:
- Show each detected issue
- Display current and suggested values
- Prompt for confirmation (y/N/q)
- Apply only approved fixes
Example session:
Issue 1/4
Type: exact_count
Path: properties.headings.level_1
Array 'level_1' requires exactly 1 items
Suggestion: Use a range like minItems: 0, maxItems: 6
Current: {"minItems": 1, "maxItems": 1}
Suggested: {"minItems": 0, "maxItems": 6}
Apply this fix? [y/N/q]: y
✓ Applied
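The confirmation loop shown in this session can be sketched as follows. This is a guess at the behavior rather than MarkiTect's implementation: `review_fixes` is a hypothetical function, the second path in the usage example is made up, and keeping earlier approvals after `q` is an assumption.

```python
def review_fixes(issues, ask=input):
    """Ask about each issue and return the ones approved with 'y'.

    'q' stops the review (earlier approvals are kept, by assumption).
    Any other answer, including an empty one, skips the issue,
    matching the N default in the [y/N/q] prompt.
    """
    approved = []
    total = len(issues)
    for number, issue in enumerate(issues, start=1):
        print(f"Issue {number}/{total}")
        print(f"Type: {issue['type']}")
        print(f"Path: {issue['path']}")
        answer = ask("Apply this fix? [y/N/q]: ").strip().lower()
        if answer == "q":
            break
        if answer == "y":
            approved.append(issue)
    return approved


# Scripted answers stand in for keyboard input so the example runs unattended
scripted = iter(["y", "n"])
approved = review_fixes(
    [
        {"type": "exact_count", "path": "properties.headings.level_1"},
        {"type": "exact_count", "path": "properties.headings.level_2"},
    ],
    ask=lambda prompt: next(scripted),
)
print([issue["path"] for issue in approved])
```

Passing the prompt function as a parameter keeps the loop testable without a terminal.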
CI/CD Integration
Use exit codes to enforce schema quality in your pipeline:
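#!/bin/bash
# Analyze schema and fail if rigid
markitect schema-analyze schema.json
status=$?
if [ "$status" -eq 1 ]; then
    echo "Schema is too rigid (score > 50)"
    echo "Run: markitect schema-refine schema.json"
    exit 1
elif [ "$status" -ne 0 ]; then
    # Exit code 2 means the analysis itself failed
    echo "Schema analysis failed (exit code $status)"
    exit "$status"
fi
echo "Schema quality check passed"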
Schema Migration Workflow
Migrating from old format to Phase 1:
- Analyze to identify deprecated extensions:
  markitect schema-analyze old-schema.json
- Document deprecated extensions:
  markitect schema-refine old-schema.json --migrate-deprecated
- Manually migrate to the new format (automatic migration is not implemented due to complexity)
Best Practices
When to Use schema-analyze
- Before committing schemas to version control
- During code review to ensure quality
- When creating new schemas from examples
- To understand why a schema fails validation
When to Use schema-refine
- After auto-generating schemas from documents
- When inheriting legacy schemas
- To quickly fix common rigidity issues
- Before publishing schemas for reuse
When to Use --interactive
- When you need fine-grained control
- For schemas with domain-specific requirements
- When learning about schema design
- To review fixes before applying
Recommended Settings
For most use cases:
# Balanced refinement (default)
markitect schema-refine schema.json
# Conservative (preserve more constraints)
markitect schema-refine schema.json --no-round-numbers
# Aggressive (maximum flexibility; both flags are enabled by default, so this names them explicitly)
markitect schema-refine schema.json --loosen-counts --round-numbers
Understanding Rigidity Scores
The rigidity score is calculated by weighting detected issues:
| Issue Type | Weight |
|---|---|
| Exact Count | 15 |
| Overly Specific | 10 |
| No Flexibility | 8 |
| Missing Classifications | 5 |
| Deprecated Extensions | 5 |
| Missing Content Control | 3 |
Score Interpretation:
- 0-20: Excellent - Well-designed, flexible schema
- 21-40: Good - Minor improvements possible
- 41-60: Fair - Moderate rigidity, refinement recommended
- 61-80: Poor - Significant rigidity, refinement needed
- 81-100: Very Poor - Highly rigid, manual review recommended
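One plausible reading of the weight table is a capped sum over detected issues. The sketch below assumes that each occurrence adds its weight, that the total is capped at 100, and that issue types without a listed weight contribute nothing; none of these details are confirmed by this guide, and `rigidity_score` is a hypothetical name.

```python
# Weights copied from the table above
WEIGHTS = {
    "Exact Count": 15,
    "Overly Specific": 10,
    "No Flexibility": 8,
    "Missing Classifications": 5,
    "Deprecated Extensions": 5,
    "Missing Content Control": 3,
}


def rigidity_score(issue_types):
    """Sum the weight of every detected issue, capped at 100.

    Assumptions: one weight per occurrence, a hard cap at 100, and
    a weight of 0 for issue types that the table does not list.
    """
    total = sum(WEIGHTS.get(issue_type, 0) for issue_type in issue_types)
    return min(total, 100)


issues = ["Exact Count", "Exact Count", "Overly Specific", "Missing Classifications"]
print(rigidity_score(issues))  # 45
```

Under these assumptions, two exact counts, one overly specific number, and a missing classification system give 15 + 15 + 10 + 5 = 45, which falls in the Fair band.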
Integration Examples
Git Pre-commit Hook
#!/bin/bash
# .git/hooks/pre-commit
SCHEMAS=$(git diff --cached --name-only --diff-filter=ACM | grep '\.json$')
for schema in $SCHEMAS; do
    # Exit code 1 means the schema is rigid (score > 50)
    markitect schema-analyze "$schema" > /dev/null 2>&1
    if [ $? -eq 1 ]; then
        echo "Error: $schema is too rigid"
        echo "Run: markitect schema-refine $schema"
        exit 1
    fi
done
Makefile Target
.PHONY: check-schemas
check-schemas:
	@for schema in schemas/*.json; do \
		echo "Checking $$schema..."; \
		markitect schema-analyze "$$schema" || exit 1; \
	done

# schema-refine exits with 1 when no changes were needed, so only
# other non-zero codes are treated as failures
.PHONY: refine-schemas
refine-schemas:
	@for schema in schemas/*.json; do \
		echo "Refining $$schema..."; \
		markitect schema-refine "$$schema" || [ $$? -eq 1 ] || exit 2; \
	done
Python Integration
import subprocess


def analyze_schema(schema_path):
    """Analyze a schema and return its rigidity score, or None if no score is found."""
    result = subprocess.run(
        ["markitect", "schema-analyze", schema_path],
        capture_output=True,
        text=True,
    )
    # Parse the score from a line of the form "Rigidity Score: <n>/<max>"
    for line in result.stdout.splitlines():
        if "Rigidity Score:" in line:
            return int(line.split(":")[1].split("/")[0].strip())
    return None


def refine_schema(schema_path, output_path):
    """Refine a schema and save it to output_path; return True on success."""
    result = subprocess.run(
        ["markitect", "schema-refine", schema_path, "-o", output_path],
        capture_output=True,
        text=True,
    )
    # Exit code 0 means changes were applied; 1 means none were needed
    return result.returncode in (0, 1)


# Usage
score = analyze_schema("schema.json")
if score is not None and score > 50:
    print(f"Schema is rigid (score: {score})")
    refine_schema("schema.json", "schema-refined.json")
Troubleshooting
Schema Not Found
Error: Error: Schema file not found: schema.json
Solution: Check file path and ensure file exists.
Invalid JSON
Error: Error: Invalid JSON in schema file
Solution: Validate JSON syntax using jsonlint or similar tool.
No Changes Applied
Output: No refinements needed - schema is already flexible
Reason: The schema has no detectable rigidity issues, or its rigidity score is 50 or below.
Action: Use --verbose to see all issues including INFO level.
Refinement Broke Schema
Problem: Refined schema is too permissive.
Solution:
- Use --interactive to selectively apply fixes
- Use --no-loosen-counts or --no-round-numbers to preserve constraints
- Manually adjust ranges after refinement
See Also
- Schema Extensions Specification - Complete Phase 1 specification
- Schema Evolution Workplan - Roadmap for schema features
- Manpage Example - Complete example demonstrating schema validation
Support
For issues, questions, or feature requests:
- GitHub Issues: https://github.com/anthropics/markitect/issues
- Documentation: https://github.com/anthropics/markitect/docs