docs: add comprehensive Phase 2 documentation and mark completion
Created detailed user guide for schema refinement tools:
- Command reference for schema-analyze and schema-refine
- Complete options and examples
- Issue type explanations with before/after examples
- Workflow guides (basic, interactive, CI/CD, migration)
- Best practices and troubleshooting
- Integration examples (Git hooks, Makefile, Python)
- Rigidity score interpretation table

Updated TODO.md to mark Phase 2 completion:
- Documented all delivered features
- Listed key capabilities (rigidity detection, auto-refine, interactive mode)
- Noted test coverage (33 tests, 100% passing)
- Added example results (60/100 → 24/100 rigidity reduction)

Phase 2 is now complete and fully documented.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
495
docs/user-guides/SCHEMA_REFINEMENT_TOOLS.md
Normal file
@@ -0,0 +1,495 @@
# Schema Refinement Tools - User Guide

## Overview

MarkiTect Phase 2 introduces schema refinement tools that analyze and improve the JSON schemas used for markdown validation. The tools detect rigidity issues and can automatically apply fixes that make schemas more flexible and reusable.

## Quick Start

```bash
# Analyze a schema for rigidity issues
markitect schema-analyze examples/manpages/markdown-manpage-schema.json

# Refine a schema automatically
markitect schema-refine examples/manpages/markdown-manpage-schema.json --output refined-schema.json

# Review each fix interactively
markitect schema-refine examples/manpages/markdown-manpage-schema.json --interactive
```

## Commands

### schema-analyze

Analyzes a JSON schema to detect rigidity issues and calculate a rigidity score (0-100).

#### Usage

```bash
markitect schema-analyze <schema-file> [OPTIONS]
```

#### Options

- `--verbose`, `-v`: Show detailed analysis with current and suggested values

#### Examples

```bash
# Basic analysis
markitect schema-analyze schema.json

# Verbose output with details
markitect schema-analyze schema.json --verbose
```

#### Output

The analyzer provides:

- **Rigidity Score** (0-100): Higher scores indicate more rigid schemas
  - 0-40: LOW - Flexible, good design
  - 41-70: MEDIUM - Some rigidity detected
  - 71-100: HIGH - Very rigid, needs refinement

- **Phase 1 Features**: Checks for classification system and content control
- **Issue Count**: Breakdown by severity (Errors, Warnings, Info)
- **Detected Issues**: List of problems with suggestions

#### Exit Codes

- `0`: Schema is flexible (score ≤ 50)
- `1`: Schema is rigid (score > 50)
- `2`: Error occurred

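
The `schema-analyze` exit codes can be consumed from a script without parsing any output. The following is a minimal sketch, assuming `markitect` is on the `PATH`; the helper names `interpret_analyze_exit` and `analyze_status` are hypothetical and not part of MarkiTect.

```python
import subprocess

# Exit-code meanings as documented for schema-analyze in this guide.
ANALYZE_EXIT_CODES = {
    0: "flexible",  # score <= 50
    1: "rigid",     # score > 50
    2: "error",     # the analysis itself failed
}


def interpret_analyze_exit(returncode):
    """Map a schema-analyze exit code to a short status label."""
    return ANALYZE_EXIT_CODES.get(returncode, "unknown")


def analyze_status(schema_path):
    """Run schema-analyze and return its status label.

    Hypothetical helper: it needs the markitect CLI to be installed.
    """
    result = subprocess.run(
        ["markitect", "schema-analyze", schema_path],
        capture_output=True,
        text=True,
    )
    return interpret_analyze_exit(result.returncode)
```

Keeping "rigid" and "error" apart matters in automation: a rigid schema calls for refinement, while an error usually points to a missing or malformed file.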
### schema-refine

Automatically refines rigid schemas by applying fixes for detected issues.

#### Usage

```bash
markitect schema-refine <schema-file> [OPTIONS]
```

#### Options

- `--output`, `-o PATH`: Output file (default: overwrite input file)
- `--loosen-counts`: Convert exact counts to flexible ranges (default: enabled)
- `--no-loosen-counts`: Disable count loosening
- `--round-numbers`: Round overly specific numbers (default: enabled)
- `--no-round-numbers`: Disable number rounding
- `--migrate-deprecated`: Document deprecated extensions (default: disabled)
- `--dry-run`: Show changes without applying them
- `--interactive`, `-i`: Prompt for each refinement interactively

#### Examples

```bash
# Refine schema in place
markitect schema-refine schema.json

# Preview changes without applying
markitect schema-refine schema.json --dry-run

# Save refined schema to new file
markitect schema-refine schema.json --output refined-schema.json

# Review each fix interactively
markitect schema-refine schema.json --interactive

# Disable specific refinements
markitect schema-refine schema.json --no-loosen-counts
```

#### Refinement Actions

The refiner automatically applies these fixes:

1. **Exact Count Loosening**: Converts exact counts to flexible ranges
   - Before: `"minItems": 5, "maxItems": 5`
   - After: `"minItems": 3, "maxItems": 10`

2. **Const Value Conversion**: Replaces exact value constraints with ranges
   - Before: `"const": 1`
   - After: `"minimum": 0, "maximum": 2`

3. **Number Rounding**: Rounds overly specific numbers
   - Before: `"minItems": 73`
   - After: `"minItems": 70`

4. **Range Widening**: Expands narrow integer ranges
   - Before: `"minimum": 5, "maximum": 6`
   - After: `"minimum": 0, "maximum": 11`

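
The four refinement actions can be sketched as small pure functions. This guide does not state the exact formulas the refiner uses, so the rules below are assumptions inferred only from the before/after pairs shown in this guide (for example 5 becoming 3 to 10, and 1 becoming 0 to 6 in the interactive session). All function names are hypothetical.

```python
def loosen_exact_count(count):
    """Exact count -> (minItems, maxItems).

    Assumed rule: subtract 2 (never below 0) and add 5.
    It reproduces 5 -> (3, 10) and 1 -> (0, 6).
    """
    return max(0, count - 2), count + 5


def const_to_range(value):
    """Numeric const -> (minimum, maximum); assumed rule: value +/- 1."""
    return value - 1, value + 1


def round_to_ten(number):
    """Round a non-negative integer to the nearest 10 (halves round up)."""
    return (number + 5) // 10 * 10


def widen_range(minimum, maximum):
    """Narrow range -> wider range; assumed rule: 5 on each side."""
    return minimum - 5, maximum + 5


def refine_array(node):
    """Return a copy of an array schema with an exact count loosened."""
    refined = dict(node)
    if "minItems" in node and node.get("minItems") == node.get("maxItems"):
        low, high = loosen_exact_count(node["minItems"])
        refined["minItems"], refined["maxItems"] = low, high
    return refined
```

Treat these formulas as illustrations of the idea only. Use `--dry-run` to see the values the refiner would actually write.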
#### Exit Codes

- `0`: Success with changes applied
- `1`: Success but no changes needed
- `2`: Error occurred

## Issue Types

### Exact Count (WARNING)

**Problem**: Schema requires exact number of items, leaving no flexibility.

**Example**:
```json
{
  "type": "array",
  "minItems": 5,
  "maxItems": 5
}
```

**Fix**: Convert to a range
```json
{
  "type": "array",
  "minItems": 3,
  "maxItems": 10
}
```

### Const Value (WARNING)

**Problem**: Property must have exact value.

**Example**:
```json
{
  "type": "integer",
  "const": 1
}
```

**Fix**: Replace with range for numeric values
```json
{
  "type": "integer",
  "minimum": 0,
  "maximum": 2
}
```

### Overly Specific Numbers (INFO)

**Problem**: Numbers are too specific (like 73 instead of 70).

**Example**:
```json
{
  "type": "array",
  "minItems": 73
}
```

**Fix**: Round to nearest 10
```json
{
  "type": "array",
  "minItems": 70
}
```

### No Flexibility (INFO)

**Problem**: Integer range is too narrow.

**Example**:
```json
{
  "type": "integer",
  "minimum": 5,
  "maximum": 6
}
```

**Fix**: Widen the range
```json
{
  "type": "integer",
  "minimum": 0,
  "maximum": 11
}
```

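
To make the issue types above concrete, here is a minimal sketch of how such patterns could be found by walking a schema loaded as nested dictionaries. It is not MarkiTect's analyzer. The label `exact_count` appears in the tool's interactive output, but `const_value` and `no_flexibility` are assumed names, and treating a range of width 1 or less as "too narrow" is also an assumption. Overly specific numbers are left out because this guide does not define a threshold for them.

```python
def find_rigidity_issues(schema, path=""):
    """Return a list of (issue_type, dotted_path) pairs for a parsed schema."""
    issues = []
    if isinstance(schema, dict):
        # Exact count: minItems and maxItems are present and equal.
        low, high = schema.get("minItems"), schema.get("maxItems")
        if low is not None and low == high:
            issues.append(("exact_count", path))

        # Const value: the property is pinned to a single value.
        if "const" in schema:
            issues.append(("const_value", path))

        # Narrow integer range (assumed threshold: width of at most 1).
        minimum, maximum = schema.get("minimum"), schema.get("maximum")
        if isinstance(minimum, int) and isinstance(maximum, int):
            if maximum - minimum <= 1:
                issues.append(("no_flexibility", path))

        # Recurse into every nested value, extending the dotted path.
        for key, value in schema.items():
            child_path = f"{path}.{key}" if path else key
            issues.extend(find_rigidity_issues(value, child_path))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            issues.extend(find_rigidity_issues(item, f"{path}[{index}]"))
    return issues
```

The walker is deliberately naive: it inspects every dictionary, so a property that happens to be named `const` or `minItems` would be reported as well. A real analyzer would distinguish schema keywords from property names.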
### Missing Classifications (INFO)

**Problem**: Schema doesn't use the Phase 1 classification system.

**Suggestion**: Add `x-markitect-sections` to classify sections as required/recommended/optional/discouraged/improper.

### Missing Content Control (INFO)

**Problem**: Schema lacks content validation patterns and quality metrics.

**Suggestion**: Add `x-markitect-content-control` for pattern validation and quality requirements.

### Deprecated Extensions (WARNING)

**Problem**: Schema uses old extension format.

**Example**: `x-markitect-required-sections`

**Suggestion**: Migrate to `x-markitect-sections` with classification system.

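
The three extension-related checks amount to looking for specific keys. The sketch below assumes the extensions live at the top level of the schema object, which this guide does not state, and uses hypothetical label names for the findings.

```python
# Deprecated extension key -> its replacement, as named in this guide.
DEPRECATED_EXTENSIONS = {
    "x-markitect-required-sections": "x-markitect-sections",
}


def check_extensions(schema):
    """Return a list of findings about MarkiTect extension keys.

    Assumption: extension keys sit at the top level of the schema dict.
    """
    findings = []
    if "x-markitect-sections" not in schema:
        findings.append("missing_classifications")
    if "x-markitect-content-control" not in schema:
        findings.append("missing_content_control")
    for old_key, new_key in DEPRECATED_EXTENSIONS.items():
        if old_key in schema:
            findings.append(f"deprecated_extension: {old_key} -> {new_key}")
    return findings
```

A schema that still carries only `x-markitect-required-sections` would produce all three findings, which is the typical starting point for the migration workflow.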
## Workflows

### Basic Workflow: Analyze and Refine

1. **Analyze** your schema to understand issues:
   ```bash
   markitect schema-analyze my-schema.json --verbose
   ```

2. **Preview** refinements before applying:
   ```bash
   markitect schema-refine my-schema.json --dry-run
   ```

3. **Apply** refinements:
   ```bash
   markitect schema-refine my-schema.json --output my-schema-refined.json
   ```

4. **Verify** improvements:
   ```bash
   markitect schema-analyze my-schema-refined.json
   ```

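
The four steps of the basic workflow can be chained from a script. This is a sketch under the assumption that `markitect` is on the `PATH`; `workflow_commands` and `run_workflow` are hypothetical helper names. Exit code `1` is not treated as a failure here, because it means "rigid" for `schema-analyze` and "no changes needed" for `schema-refine`.

```python
import subprocess


def workflow_commands(schema, refined):
    """Build the argument lists for analyze, preview, apply, and verify."""
    return [
        ["markitect", "schema-analyze", schema, "--verbose"],
        ["markitect", "schema-refine", schema, "--dry-run"],
        ["markitect", "schema-refine", schema, "--output", refined],
        ["markitect", "schema-analyze", refined],
    ]


def run_workflow(schema, refined):
    """Run each step in order; stop and return the code on a real error (>= 2)."""
    for command in workflow_commands(schema, refined):
        returncode = subprocess.run(command).returncode
        if returncode >= 2:
            return returncode
    return 0
```

Building the commands separately from running them keeps the sequence easy to log or inspect before anything touches the schema files.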
### Interactive Workflow

For fine-grained control, use interactive mode:

```bash
markitect schema-refine my-schema.json --interactive
```

The tool will:
1. Show each detected issue
2. Display current and suggested values
3. Prompt for confirmation (y/N/q)
4. Apply only approved fixes

Example session:
```
Issue 1/4
Type: exact_count
Path: properties.headings.level_1
Array 'level_1' requires exactly 1 items
Suggestion: Use a range like minItems: 0, maxItems: 6
Current: {"minItems": 1, "maxItems": 1}
Suggested: {"minItems": 0, "maxItems": 6}

Apply this fix? [y/N/q]: y
✓ Applied
```

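
The confirmation loop behind a `[y/N/q]` prompt can be modelled in a few lines. The semantics below are assumptions based on the usual convention rather than on MarkiTect's source: `y` approves a fix, `q` stops the review while keeping fixes approved so far, and anything else (including an empty answer) skips, since the capital `N` marks the default. The function name `select_fixes` is hypothetical.

```python
def select_fixes(issues, answers):
    """Return the issues approved for fixing, given one answer per issue."""
    approved = []
    for issue, answer in zip(issues, answers):
        choice = answer.strip().lower()
        if choice == "q":
            break  # stop reviewing; earlier approvals are kept (assumption)
        if choice == "y":
            approved.append(issue)
        # any other input, including an empty line, skips this issue
    return approved
```

In a real session the answers would come from `input()` one at a time; passing them as a list keeps the sketch easy to test.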
### CI/CD Integration

Use exit codes to enforce schema quality in your pipeline:

```bash
#!/bin/bash

# Analyze schema and fail if rigid
markitect schema-analyze schema.json
status=$?

if [ "$status" -eq 1 ]; then
    echo "Schema is too rigid (score > 50)"
    echo "Run: markitect schema-refine schema.json"
    exit 1
elif [ "$status" -ne 0 ]; then
    # Exit code 2 means the analysis itself failed, not that the schema is rigid
    echo "Schema analysis failed (exit code $status)"
    exit "$status"
fi

echo "Schema quality check passed"
```

### Schema Migration Workflow

Migrating from old format to Phase 1:

1. **Analyze** to identify deprecated extensions:
   ```bash
   markitect schema-analyze old-schema.json
   ```

2. **Document** deprecated extensions:
   ```bash
   markitect schema-refine old-schema.json --migrate-deprecated
   ```

3. **Manually migrate** to new format (automatic migration not implemented due to complexity)

## Best Practices

### When to Use schema-analyze

- Before committing schemas to version control
- During code review to ensure quality
- When creating new schemas from examples
- To understand why a schema fails validation

### When to Use schema-refine

- After auto-generating schemas from documents
- When inheriting legacy schemas
- To quickly fix common rigidity issues
- Before publishing schemas for reuse

### When to Use --interactive

- When you need fine-grained control
- For schemas with domain-specific requirements
- When learning about schema design
- To review fixes before applying

### Recommended Settings

For most use cases:
```bash
# Balanced refinement (default)
markitect schema-refine schema.json

# Conservative (preserve more constraints)
markitect schema-refine schema.json --no-round-numbers

# Aggressive (maximum flexibility); both flags are already enabled by default,
# so this is equivalent to the balanced command with the options spelled out
markitect schema-refine schema.json --loosen-counts --round-numbers
```

## Understanding Rigidity Scores

The rigidity score is calculated by weighting detected issues:

| Issue Type | Weight |
|------------|--------|
| Exact Count | 15 |
| Overly Specific | 10 |
| No Flexibility | 8 |
| Missing Classifications | 5 |
| Deprecated Extensions | 5 |
| Missing Content Control | 3 |

**Score Interpretation**:
- **0-20**: Excellent - Well-designed, flexible schema
- **21-40**: Good - Minor improvements possible
- **41-60**: Fair - Moderate rigidity, refinement recommended
- **61-80**: Poor - Significant rigidity, refinement needed
- **81-100**: Very Poor - Highly rigid, manual review recommended

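
As a worked illustration of the weights and bands above, the sketch below adds up weight times issue count and caps the total at 100. This guide only says the score is "calculated by weighting detected issues", so the sum-and-cap aggregation is an assumption, as are the dictionary keys and function names.

```python
# Weights copied from the table above; the key names are hypothetical.
ISSUE_WEIGHTS = {
    "exact_count": 15,
    "overly_specific": 10,
    "no_flexibility": 8,
    "missing_classifications": 5,
    "deprecated_extensions": 5,
    "missing_content_control": 3,
}

# Upper bound of each interpretation band and its label.
SCORE_BANDS = [
    (20, "Excellent"),
    (40, "Good"),
    (60, "Fair"),
    (80, "Poor"),
    (100, "Very Poor"),
]


def rigidity_score(issue_counts):
    """Assumed aggregation: sum of weight * count, capped at 100."""
    total = sum(
        ISSUE_WEIGHTS.get(issue_type, 0) * count
        for issue_type, count in issue_counts.items()
    )
    return min(100, total)


def interpret_score(score):
    """Return the interpretation label for a score between 0 and 100."""
    for upper_bound, label in SCORE_BANDS:
        if score <= upper_bound:
            return label
    raise ValueError(f"score out of range: {score}")
```

Under this assumption, two exact counts and one overly specific number give 15 + 15 + 10 = 40, which falls in the "Good" band.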
## Integration Examples

### Git Pre-commit Hook

```bash
#!/bin/bash
# .git/hooks/pre-commit

# Staged JSON files that were added, copied, or modified
SCHEMAS=$(git diff --cached --name-only --diff-filter=ACM | grep '\.json$')

for schema in $SCHEMAS; do
    if markitect schema-analyze "$schema" 2>&1 | grep -q "RIGID"; then
        echo "Error: $schema is too rigid"
        echo "Run: markitect schema-refine $schema"
        exit 1
    fi
done
```

### Makefile Target

```makefile
.PHONY: check-schemas
check-schemas:
	@for schema in schemas/*.json; do \
		echo "Checking $$schema..."; \
		markitect schema-analyze "$$schema" || exit 1; \
	done

# schema-refine exits with 1 when no changes were needed, so only
# exit codes of 2 or higher are treated as failures here.
.PHONY: refine-schemas
refine-schemas:
	@for schema in schemas/*.json; do \
		echo "Refining $$schema..."; \
		markitect schema-refine "$$schema"; \
		status=$$?; \
		if [ $$status -ge 2 ]; then exit $$status; fi; \
	done
```

### Python Integration

```python
import subprocess


def analyze_schema(schema_path):
    """Analyze a schema and return its rigidity score, or None if not found."""
    result = subprocess.run(
        ["markitect", "schema-analyze", schema_path],
        capture_output=True,
        text=True
    )

    # Parse output for a line of the form "Rigidity Score: <score>/100"
    for line in result.stdout.split('\n'):
        if 'Rigidity Score:' in line:
            score = int(line.split(':')[1].split('/')[0].strip())
            return score
    return None


def refine_schema(schema_path, output_path):
    """Refine a schema and save to output path."""
    result = subprocess.run(
        ["markitect", "schema-refine", schema_path, "-o", output_path],
        capture_output=True,
        text=True
    )
    # Exit code 0: changes applied; 1: no changes needed; 2: error
    return result.returncode in (0, 1)


# Usage
score = analyze_schema("schema.json")
if score is not None and score > 50:
    print(f"Schema is rigid (score: {score})")
    refine_schema("schema.json", "schema-refined.json")
```

## Troubleshooting

### Schema Not Found

**Error**: `Error: Schema file not found: schema.json`

**Solution**: Check file path and ensure file exists.

### Invalid JSON

**Error**: `Error: Invalid JSON in schema file`

**Solution**: Validate JSON syntax using `jsonlint` or similar tool.

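
If no linter is at hand, Python's standard `json` module can locate a syntax error. The sketch below is independent of MarkiTect; the function names and message wording are illustrative only.

```python
import json


def check_json_text(text):
    """Return None if text is valid JSON, else a message with the error position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


def check_json_file(path):
    """Check a schema file on disk, also reporting a missing file."""
    try:
        with open(path, encoding="utf-8") as handle:
            return check_json_text(handle.read())
    except FileNotFoundError:
        return f"Schema file not found: {path}"
```

Running `python -m json.tool schema.json` from a shell performs a similar check without any extra code.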
### No Changes Applied

**Output**: `No refinements needed - schema is already flexible`

**Reason**: The schema has no detectable rigidity issues, or its rigidity score is 50 or below.

**Action**: Use `--verbose` to see all issues including INFO level.

### Refinement Broke Schema

**Problem**: Refined schema is too permissive.

**Solution**:
1. Use `--interactive` to selectively apply fixes
2. Use `--no-loosen-counts` or `--no-round-numbers` to preserve constraints
3. Manually adjust ranges after refinement

## See Also

- [Schema Extensions Specification](../specifications/schema-extensions-spec.md) - Complete Phase 1 specification
- [Schema Evolution Workplan](../../examples/manpages/SCHEMA_EVOLUTION_WORKPLAN.md) - Roadmap for schema features
- [Manpage Example](../../examples/manpages/README.md) - Complete example demonstrating schema validation

## Support

For issues, questions, or feature requests:
- GitHub Issues: https://github.com/anthropics/markitect/issues
- Documentation: https://github.com/anthropics/markitect/docs