Compare commits
4 Commits
c25795fb79...b5f510f9c7
| Author | SHA1 | Date | |
|---|---|---|---|
| | b5f510f9c7 | | |
| | 22008875d3 | | |
| | 30b5f1c5bd | | |
| | a3855f0dd5 | | |
64
AUTONOMOUS_WORK_REMINDER.md
Normal file
@@ -0,0 +1,64 @@
|
||||
# Autonomous Work Reminder - TDD8 Implementation
|
||||
|
||||
## 🎯 MISSION: Complete Issue #50 - Metaschema Definition
|
||||
|
||||
**CRITICAL REMINDERS FOR AUTONOMOUS WORK:**
|
||||
|
||||
### 📋 TDD8 Workflow - NEVER SKIP STEPS
|
||||
1. **ISSUE** - Understand requirements (Issue #50 already analyzed)
|
||||
2. **TEST** - Write failing tests first (RED state required)
|
||||
3. **RED** - Verify tests fail before implementation
|
||||
4. **GREEN** - Implement minimal code to pass tests
|
||||
5. **REFACTOR** - Clean up code while keeping tests green
|
||||
6. **DOCUMENT** - Update documentation and help
|
||||
7. **REFINE** - Polish and optimize
|
||||
8. **PUBLISH** - Commit and close issue
|
||||
|
||||
### 🚨 AUTONOMOUS WORK PROTOCOLS
|
||||
|
||||
#### DO NOT FORGET TO:
|
||||
- ✅ Run tests after each change to verify state
|
||||
- ✅ Commit frequently with descriptive messages
|
||||
- ✅ Update CLI help when adding new features
|
||||
- ✅ Maintain backward compatibility
|
||||
- ✅ Follow existing code patterns and conventions
|
||||
- ✅ Set `PYTHONPATH=.` for all test runs
|
||||
- ✅ Close the issue when complete using: `make close-issue NUM=50`
|
||||
|
||||
#### QUALITY STANDARDS:
|
||||
- All tests must pass before moving to next TDD8 step
|
||||
- Code must follow existing project conventions
|
||||
- Documentation must be comprehensive
|
||||
- CLI integration must be complete and tested
|
||||
|
||||
#### ISSUE #50 SPECIFIC REQUIREMENTS:
|
||||
- Define JSON Schema metaschema for MarkiTect extensions
|
||||
- Support heading text capture
|
||||
- Support content field instructions
|
||||
- Support outline structure representation
|
||||
- Maintain backward compatibility with existing schemas
|
||||
- Include validation rules for new features
|
||||
|
||||
#### COMPLETION CRITERIA:
|
||||
- Metaschema JSON file created and validated
|
||||
- Tests cover all metaschema features
|
||||
- Documentation explains structure and usage
|
||||
- CLI can validate schemas against metaschema
|
||||
- All existing schemas still validate correctly
|
||||
|
||||
### 🔄 WORKFLOW COMMANDS
|
||||
```bash
# Start work
make tdd-start NUM=50

# Run tests
PYTHONPATH=. python3 -m pytest tests/ --tb=short -q

# Commit work
git add . && git commit -m "step: [TDD8_PHASE] description"

# Close issue when complete
make close-issue NUM=50
```
|
||||
|
||||
### 🎯 SUCCESS = Issue #50 completely implemented, tested, documented, and closed
|
||||
188
GAMEPLAN.md
Normal file
@@ -0,0 +1,188 @@
|
||||
# MarkiTect Schema Generation Capability Outline - GAMEPLAN
|
||||
|
||||
## 🎯 Mission: Transform MarkiTect from Static Analysis to Dynamic Generation
|
||||
|
||||
**Parent Issue**: [#46 - Schema generation capability outline](http://gitea.coulomb.social/coulomb/markitect_project/issues/46)
|
||||
|
||||
**Vision**: Enable users to generate document variations from example documents through schema-driven templates with content instructions and data automation.
|
||||
|
||||
---
|
||||
|
||||
## 📋 Issue Breakdown & Implementation Order
|
||||
|
||||
### **🏗️ Phase 1: Foundation (HIGH PRIORITY)**
|
||||
|
||||
#### Issue #50: Define metaschema for JSON schema structure
|
||||
- **Priority**: High
|
||||
- **Status**: Ready to start
|
||||
- **Dependencies**: Current schema generation (Issue #5), JSON Schema validation (Issue #7)
|
||||
- **Goal**: Create JSON Schema specification that extends standard JSON Schema with MarkiTect-specific features
|
||||
- **Key Features**:
|
||||
- Heading text capture support
|
||||
- Content field instructions support
|
||||
- Outline structure representation
|
||||
- Backward compatibility with existing schemas
|
||||
- **Start Command**: `make tdd-start NUM=50`
|
||||
|
||||
---
|
||||
|
||||
### **🔧 Phase 2: Core Features (HIGH-MEDIUM PRIORITY)**
|
||||
|
||||
#### Issue #51: Add outline mode to schema generation
|
||||
- **Priority**: High
|
||||
- **Dependencies**: Metaschema definition (Issue #50)
|
||||
- **Goal**: `markitect schema-generate --mode outline --depth 3 --outfile invoice.json example.md`
|
||||
- **Key Features**:
|
||||
- New `--mode outline` option
|
||||
- `--depth` parameter for control
|
||||
- Schema title: "Schema from example.md" (not "for")
|
||||
- Actual heading text capture
|
||||
|
||||
#### Issue #52: Capture actual heading text in schemas
|
||||
- **Priority**: Medium
|
||||
- **Dependencies**: Metaschema (Issue #50), Current schema generation (Issue #5)
|
||||
- **Goal**: Preserve exact heading text in schemas for validation
|
||||
- **Key Features**:
|
||||
- Store heading text alongside structure
|
||||
- Enable heading text validation
|
||||
- Meaningful error messages for mismatches
|
||||
|
||||
---
|
||||
|
||||
### **📝 Phase 3: Content Instructions (MEDIUM PRIORITY)**
|
||||
|
||||
#### Issue #54: Add content field instruction capabilities
|
||||
- **Priority**: Medium
|
||||
- **Dependencies**: Metaschema (Issue #50), Heading text capture (Issue #52)
|
||||
- **Goal**: Include guidance for content authors in schemas
|
||||
- **Key Features**:
|
||||
- Instructions for each section/content area
|
||||
- Support for different content types
|
||||
- Optional/required instruction flags
|
||||
- CLI support for adding instructions
|
||||
|
||||
---
|
||||
|
||||
### **🚀 Phase 4: Generation Pipeline (MEDIUM PRIORITY)**
|
||||
|
||||
#### Issue #55: Schema-based draft generation
|
||||
- **Priority**: Medium
|
||||
- **Dependencies**: All previous issues, Current stub generation (Issue #6)
|
||||
- **Goal**: Generate document templates from schemas with instructions
|
||||
- **Key Features**:
|
||||
- New CLI command for draft generation
|
||||
- Proper heading hierarchy from schema
|
||||
- Content instruction placeholders
|
||||
- Schema reference for future validation
|
||||
|
||||
---
|
||||
|
||||
### **🤖 Phase 5: Data Automation (LOW PRIORITY)**
|
||||
|
||||
#### Issue #56: Data-driven multiple draft generation
|
||||
- **Priority**: Low
|
||||
- **Dependencies**: Schema-based draft generation (Issue #55)
|
||||
- **Goal**: Batch document generation from data sources
|
||||
- **Key Features**:
|
||||
- Multiple data formats (JSON, CSV)
|
||||
- Field mapping from data to schema
|
||||
- Batch generation capabilities
|
||||
- Data validation against schema
|
||||
|
||||
---
|
||||
|
||||
## 🛣️ Complete User Workflow (Target State)
|
||||
|
||||
```bash
# 1. Generate schema from example document
markitect schema-generate --mode outline --depth 3 --outfile requirements_schema.json example_requirements.md

# 2. Tune the schema (manual editing)
#    - Remove overly specific elements
#    - Add content instructions
#    - Refine outline structure

# 3. Generate drafts from schema
markitect generate-draft requirements_schema.json --outfile new_requirements.md

# 4. Data-driven batch generation (future)
markitect generate-batch requirements_schema.json --data projects.csv --output-dir ./generated/

# 5. Validate generated documents
markitect validate new_requirements.md requirements_schema.json
```
|
||||
|
||||
---
|
||||
|
||||
## 🎯 Implementation Strategy
|
||||
|
||||
### **Foundation-First Approach**
|
||||
1. **Start with Issue #50** - metaschema is prerequisite for everything
|
||||
2. **Parallel development** possible for Issues #51, #52 after #50
|
||||
3. **Sequential dependency** for Issues #54, #55, #56
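
The dependency notes in the phases above imply an implementation order. A minimal sketch, assuming only the dependencies between the new issues listed in this gameplan (Issues #5, #6, and #7 are treated as already done and left out), derives one valid order with Python's standard `graphlib` (Python 3.9+). The `DEPENDS_ON` mapping and the variable names are illustrative and not part of MarkiTect:

```python
from graphlib import TopologicalSorter

# Issue number -> issues it depends on, as read from the phase descriptions.
# "All previous issues" for #55 is interpreted as #50, #51, #52, and #54.
DEPENDS_ON = {
    50: set(),
    51: {50},
    52: {50},
    54: {50, 52},
    55: {50, 51, 52, 54},
    56: {55},
}

# static_order() yields every issue after all of its dependencies.
order = list(TopologicalSorter(DEPENDS_ON).static_order())
print(order)
```

The relative order of #51 and #52 is not fixed by the graph, which matches the note that they can be developed in parallel once #50 is done.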
|
||||
|
||||
### **TDD Workflow Integration**
|
||||
- Use `make tdd-start NUM=X` for each issue
|
||||
- Write tests first, implement features second
|
||||
- Maintain backward compatibility throughout
|
||||
|
||||
### **Testing Strategy**
|
||||
- Each issue requires comprehensive test coverage
|
||||
- Integration tests for end-to-end workflow
|
||||
- Performance testing for batch generation
|
||||
|
||||
### **Documentation Requirements**
|
||||
- CLI help updates for new options
|
||||
- User guide for complete workflow
|
||||
- API documentation for new schema features
|
||||
|
||||
---
|
||||
|
||||
## 📊 Success Metrics
|
||||
|
||||
### **Phase 1 Success**: Metaschema Defined
|
||||
- ✅ Extended JSON Schema with MarkiTect features
|
||||
- ✅ Backward compatibility maintained
|
||||
- ✅ Validation rules implemented
|
||||
|
||||
### **Phase 2 Success**: Outline Mode Working
|
||||
- ✅ `--mode outline` generates proper schemas
|
||||
- ✅ Heading text captured accurately
|
||||
- ✅ Depth control functional
|
||||
|
||||
### **Phase 3 Success**: Instructions Integrated
|
||||
- ✅ Content instructions in schemas
|
||||
- ✅ Instructions appear in generated drafts
|
||||
- ✅ Validation includes instruction compliance
|
||||
|
||||
### **Phase 4 Success**: Draft Generation
|
||||
- ✅ Schema-to-document generation working
|
||||
- ✅ Structured templates with placeholders
|
||||
- ✅ Round-trip validation (generate → validate)
|
||||
|
||||
### **Phase 5 Success**: Data Automation
|
||||
- ✅ Batch generation from data sources
|
||||
- ✅ Field mapping functionality
|
||||
- ✅ Production-ready automation pipeline
|
||||
|
||||
---
|
||||
|
||||
## 🚦 Current Status
|
||||
|
||||
**Active Phase**: Ready to start Phase 1
|
||||
**Next Action**: `make tdd-start NUM=50`
|
||||
**Estimated Timeline**: 6-8 development sessions across phases
|
||||
**Risk Level**: Low (building on solid foundation)
|
||||
|
||||
---
|
||||
|
||||
## 📝 Notes
|
||||
|
||||
- This gameplan transforms Issue #46 from concept to implementation roadmap
|
||||
- Each phase delivers user value incrementally
|
||||
- Foundation-first approach ensures stable architecture
|
||||
- TDD methodology maintains quality throughout development
|
||||
- End result: Powerful document automation pipeline for MarkiTect users
|
||||
|
||||
**Last Updated**: 2025-01-26
|
||||
**Status**: Active Gameplan
|
||||
71
ISSUE_WORKFLOW_REMINDER.md
Normal file
@@ -0,0 +1,71 @@
|
||||
# Issue Management Workflow Reminder
|
||||
|
||||
## 🎯 CRITICAL REMINDER: Gitea is the Source of Truth
|
||||
|
||||
**PRIMARY RULE**: When discussing issues for assessment, feasibility evaluation, prioritization, or implementation planning, ALWAYS fetch the issue directly from Gitea.
|
||||
|
||||
## When to Fetch from Gitea
|
||||
|
||||
### ✅ Always Fetch from Gitea When:
|
||||
- Assessing feasibility of an issue
|
||||
- Deciding if we should implement an issue next
|
||||
- Refining issue requirements or scope
|
||||
- Evaluating whether to drop an issue
|
||||
- Discussing implementation strategy
|
||||
- Planning issue priority
|
||||
- Issue is not currently in the working directory
|
||||
- Issue has been implemented before but needs review
|
||||
|
||||
### ⚠️ Local Files Are Insufficient For:
|
||||
- Issue assessment discussions
|
||||
- Implementation planning
|
||||
- Priority evaluation
|
||||
- Scope refinement
|
||||
- Feasibility analysis
|
||||
|
||||
## Source of Truth Hierarchy
|
||||
|
||||
1. **Gitea Repository** - Primary datastore for all issues
|
||||
2. **Working Directory** - Only for issues currently being implemented
|
||||
3. **Local Index/Cache** - For quick reference only, not decision-making
|
||||
|
||||
## Proper Workflow
|
||||
|
||||
```bash
# When discussing Issue #46 (or any issue number):
# 1. Use WebFetch or GitLab/Gitea tools to fetch the live issue
# 2. Read the current state, comments, and requirements
# 3. Base all decisions on the live Gitea data
# 4. Do NOT rely on local files, cached data, or assumptions
```
|
||||
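
One way to fetch the live issue is Gitea's REST API, which serves a single issue at `/api/v1/repos/{owner}/{repo}/issues/{index}`. The sketch below is an assumption about how that could be scripted, not part of the project's tooling: the base URL, owner, and repository are taken from the parent-issue link in GAMEPLAN.md, and the function names `issue_api_url` and `fetch_issue` are hypothetical. Private repositories would need an access token.

```python
import json
import urllib.request
from typing import Optional


def issue_api_url(base_url: str, owner: str, repo: str, number: int) -> str:
    """Build the Gitea REST API URL for a single issue."""
    return f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/issues/{number}"


def fetch_issue(url: str, token: Optional[str] = None) -> dict:
    """Fetch the issue as a dict (performs a network request)."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    if token:
        # Gitea accepts personal access tokens in this header form.
        request.add_header("Authorization", f"token {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_issue(url) to hit the server.
    url = issue_api_url("http://gitea.coulomb.social", "coulomb", "markitect_project", 46)
    print(url)
```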
|
||||
## Implementation Commands
|
||||
|
||||
```bash
# ✅ WORKING: Use existing Makefile targets
make show-issue NUM=46     # Show detailed issue #46
make list-issues           # List all issues with status
make list-open-issues      # Show only open issues

# ✅ WORKING: Export for analysis
make issues-get            # Export compact TSV to ISSUES.index
make issues-json           # Export all issues as JSON
make issues-csv            # Export as CSV for spreadsheet analysis
make issues-high           # Export only high/critical priority

# ❌ NOT AVAILABLE: These require additional tools (commented out so the block stays runnable)
# gh issue view 46 --repo your-repo
# WebFetch "https://gitea-instance/repo/issues/46"   # (certificate issues)
```
|
||||
|
||||
## Why This Matters
|
||||
|
||||
- **Accuracy**: Issues may have been updated, refined, or closed
|
||||
- **Completeness**: Comments and discussions provide crucial context
|
||||
- **Current State**: Status, labels, and priority may have changed
|
||||
- **Team Collaboration**: Other team members may have added insights
|
||||
- **Implementation History**: Previous attempts or decisions are documented
|
||||
|
||||
---
|
||||
|
||||
**🚨 REMINDER TO CLAUDE**: Before discussing any issue assessment, feasibility, or planning, ALWAYS fetch the issue from Gitea first. Local files are NOT sufficient for decision-making about issues.
|
||||
@@ -16,6 +16,9 @@ This document tracks Claude Code issues that directly impact our development wor
|
||||
- Remove resolved issues after confirming fixes work in our environment
|
||||
- Maintained by the claude-expert subagent as part of issue tracking responsibilities
|
||||
|
||||
**🎯 CRITICAL WORKFLOW REMINDER:**
|
||||
When discussing project issues (not Claude Code issues), ALWAYS fetch from Gitea first. Gitea is the source of truth for all issue assessment, feasibility evaluation, and implementation planning. Local files are insufficient for decision-making about issues. See ISSUE_WORKFLOW_REMINDER.md for complete workflow.
|
||||
|
||||
---
|
||||
|
||||
## Resolved Issues
|
||||
|
||||
@@ -1450,27 +1450,65 @@ def ast_stats(config, file_path, format):
|
||||
@click.argument('file_path', type=click.Path(exists=True, path_type=Path))
|
||||
@click.option('--max-depth', '-d', type=int, help='Maximum heading depth to include in schema')
|
||||
@click.option('--output', '-o', type=click.Path(path_type=Path), help='Output file path (default: stdout)')
|
||||
@click.option('--outfile', type=click.Path(path_type=Path), help='Output file path (alias for --output)')
|
||||
@click.option('--format', 'output_format', type=click.Choice(['json', 'yaml']), default='json', help='Output format')
|
||||
@click.option('--mode', type=click.Choice(['outline']), help='Generation mode: outline for structure-focused schemas')
|
||||
@click.option('--depth', type=int, help='Maximum depth for outline mode (similar to --max-depth)')
|
||||
@pass_config
|
||||
def generate_schema(config, file_path, max_depth, output, output_format):
|
||||
def generate_schema(config, file_path, max_depth, output, outfile, output_format, mode, depth):
|
||||
"""
|
||||
Generate a JSON schema from a markdown file's AST structure.
|
||||
|
||||
FILE_PATH: Path to the markdown file to analyze
|
||||
|
||||
Example:
|
||||
Examples:
|
||||
markitect schema-generate document.md
|
||||
markitect schema-generate document.md --max-depth 2
|
||||
markitect schema-generate document.md --output schema.json
|
||||
|
||||
# Outline mode for structure-focused schemas
|
||||
markitect schema-generate --mode outline document.md
|
||||
markitect schema-generate --mode outline --depth 3 --outfile schema.json document.md
|
||||
|
||||
Modes:
|
||||
Default: Standard schema generation with structural analysis
|
||||
Outline: Structure-focused schema with heading text capture and metaschema extensions
|
||||
"""
|
||||
try:
|
||||
# Handle parameter conflicts and defaults
|
||||
if outfile and output:
|
||||
click.echo("Error: Cannot specify both --output and --outfile", err=True)
|
||||
sys.exit(1)
|
||||
|
||||
# Use outfile as output if specified
|
||||
final_output = outfile or output
|
||||
|
||||
# Handle depth parameter for outline mode
|
||||
if mode == 'outline':
|
||||
if depth is not None and max_depth is not None:
|
||||
click.echo("Error: Cannot specify both --depth and --max-depth with outline mode", err=True)
|
||||
sys.exit(1)
|
||||
final_depth = depth if depth is not None else max_depth
|
||||
else:
|
||||
final_depth = max_depth
|
||||
|
||||
# Validate depth parameter
|
||||
if final_depth is not None and final_depth < 1:
|
||||
click.echo("Invalid depth parameter: depth must be >= 1", err=True)
|
||||
sys.exit(1)
|
||||
|
||||
# Initialize schema generator and associated files manager
|
||||
generator = SchemaGenerator()
|
||||
from .associated_files import AssociatedFilesManager
|
||||
associated_files = AssociatedFilesManager()
|
||||
|
||||
# Generate schema
|
||||
schema = generator.generate_schema_from_file(file_path, max_depth=max_depth)
|
||||
# Generate schema with mode support
|
||||
schema = generator.generate_schema_from_file(
|
||||
file_path,
|
||||
max_depth=final_depth,
|
||||
mode=mode,
|
||||
outline_depth=final_depth if mode == 'outline' else None
|
||||
)
|
||||
|
||||
# Format output
|
||||
if output_format == 'json':
|
||||
@@ -1481,18 +1519,18 @@ def generate_schema(config, file_path, max_depth, output, output_format):
|
||||
formatted_output = json.dumps(schema, indent=2, ensure_ascii=False)
|
||||
|
||||
# Mode-based output logic
|
||||
if not output and should_use_associated_files():
|
||||
if not final_output and should_use_associated_files():
|
||||
# Interactive mode: use associated file path
|
||||
from .associated_files import AssociatedFilesManager
|
||||
associated_files = AssociatedFilesManager()
|
||||
output = associated_files.get_associated_schema_path(file_path)
|
||||
final_output = associated_files.get_associated_schema_path(file_path)
|
||||
if config.get('verbose'):
|
||||
click.echo(f"Interactive mode: using associated file path: {output}", err=True)
|
||||
click.echo(f"Interactive mode: using associated file path: {final_output}", err=True)
|
||||
|
||||
# Write to output
|
||||
if output:
|
||||
output.write_text(formatted_output, encoding='utf-8')
|
||||
click.echo(f"Schema written to: {output}")
|
||||
if final_output:
|
||||
final_output.write_text(formatted_output, encoding='utf-8')
|
||||
click.echo(f"Schema written to: {final_output}")
|
||||
|
||||
# Show summary
|
||||
properties = schema.get('properties', {})
|
||||
@@ -1653,14 +1691,16 @@ def schema_ingest(config, schema_file, name):
|
||||
"""
|
||||
Read and store a JSON schema file in the database.
|
||||
|
||||
Implements Issue #3 functionality to ingest external schema files
|
||||
and store them for later use with validation and other operations.
|
||||
Validates schemas against the MarkiTect metaschema to ensure compatibility
|
||||
with MarkiTect features like heading text capture and content instructions.
|
||||
Implements Issue #3 and Issue #50 functionality.
|
||||
|
||||
SCHEMA_FILE: Path to the JSON schema file to store
|
||||
|
||||
Examples:
|
||||
markitect schema-ingest my_schema.json
|
||||
markitect schema-ingest external_schema.json --name custom-name
|
||||
markitect schema-ingest markitect_schema.json -v # Show metaschema validation
|
||||
"""
|
||||
try:
|
||||
# Determine schema name
|
||||
@@ -1677,6 +1717,25 @@ def schema_ingest(config, schema_file, name):
|
||||
click.echo(f"Error: Invalid JSON in schema file - {e}", err=True)
|
||||
sys.exit(1)
|
||||
|
||||
# Validate against MarkiTect metaschema
|
||||
from .metaschema import MetaschemaValidator
|
||||
try:
|
||||
metaschema_validator = MetaschemaValidator()
|
||||
validation_result = metaschema_validator.validate_schema_with_errors(schema_data)
|
||||
|
||||
if not validation_result.is_valid:
|
||||
click.echo("⚠️ Schema validation warnings against MarkiTect metaschema:", err=True)
|
||||
for error in validation_result.errors:
|
||||
click.echo(f" - {error.message}", err=True)
|
||||
click.echo(" Schema will be stored but may not be fully compatible with MarkiTect features.", err=True)
|
||||
else:
|
||||
if config.get('verbose'):
|
||||
click.echo("✅ Schema validates successfully against MarkiTect metaschema")
|
||||
|
||||
except Exception as e:
|
||||
if config.get('verbose'):
|
||||
click.echo(f"⚠️ Could not validate against metaschema: {e}", err=True)
|
||||
|
||||
# Initialize database and store schema
|
||||
from .database import DatabaseManager
|
||||
db_path = config.get('database', 'markitect.db')
|
||||
|
||||
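
The option handling added to `generate_schema` above can be isolated as a pure function, which makes the conflict and range rules easy to test without invoking the CLI. This is a sketch under that assumption: `resolve_depth` and `OptionError` are hypothetical names, and the real command reports errors through `click.echo` and `sys.exit(1)` instead of raising.

```python
from typing import Optional


class OptionError(ValueError):
    """Raised when CLI options conflict or are out of range (hypothetical)."""


def resolve_depth(
    mode: Optional[str],
    depth: Optional[int],
    max_depth: Optional[int],
) -> Optional[int]:
    """Return the effective depth limit, mirroring the rules in the diff."""
    if mode == "outline":
        # In outline mode the two depth options are mutually exclusive.
        if depth is not None and max_depth is not None:
            raise OptionError(
                "Cannot specify both --depth and --max-depth with outline mode"
            )
        final_depth = depth if depth is not None else max_depth
    else:
        # Outside outline mode, --depth is ignored and only --max-depth applies.
        final_depth = max_depth

    if final_depth is not None and final_depth < 1:
        raise OptionError("Invalid depth parameter: depth must be >= 1")
    return final_depth
```

Note that, as in the diff, `--depth` given without `--mode outline` is silently ignored; whether that should be an error instead is a design decision the source does not settle.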
196
markitect/metaschema.py
Normal file
@@ -0,0 +1,196 @@
|
||||
"""
MarkiTect Metaschema Module for Issue #50

This module provides metaschema validation for MarkiTect JSON schemas,
extending standard JSON Schema with MarkiTect-specific features.

This is a TDD8 implementation - tests are written first, implementation follows.
"""

from pathlib import Path
from typing import Dict, Any, List, Optional
import json

# Path to the MarkiTect metaschema JSON file
MARKITECT_METASCHEMA_PATH = Path(__file__).parent / "schemas" / "markitect-metaschema.json"


class ValidationError:
    """Represents a schema validation error."""

    def __init__(self, message: str, path: str = ""):
        self.message = message
        self.path = path


class ValidationResult:
    """Result of schema validation against metaschema."""

    def __init__(self, is_valid: bool, errors: Optional[List[ValidationError]] = None):
        self.is_valid = is_valid
        self.errors = errors or []


class MetaschemaValidator:
    """Validates MarkiTect schemas against the MarkiTect metaschema."""

    def __init__(self):
        """Initialize the metaschema validator."""
        self._metaschema_cache = None

    def get_metaschema(self) -> Dict[str, Any]:
        """
        Get the MarkiTect metaschema.

        Returns:
            Dictionary containing the metaschema

        Raises:
            FileNotFoundError: If metaschema file doesn't exist
            json.JSONDecodeError: If metaschema file is invalid JSON
        """
        if self._metaschema_cache is None:
            if not MARKITECT_METASCHEMA_PATH.exists():
                raise FileNotFoundError(f"Metaschema file not found: {MARKITECT_METASCHEMA_PATH}")

            # Read explicitly as UTF-8 so the result does not depend on the platform locale
            with open(MARKITECT_METASCHEMA_PATH, encoding="utf-8") as f:
                self._metaschema_cache = json.load(f)

        return self._metaschema_cache

    def validate_schema(self, schema: Dict[str, Any]) -> bool:
        """
        Validate a schema against the MarkiTect metaschema.

        Args:
            schema: The schema to validate

        Returns:
            True if valid, False otherwise
        """
        result = self.validate_schema_with_errors(schema)
        return result.is_valid

    def validate_schema_with_errors(self, schema: Dict[str, Any]) -> ValidationResult:
        """
        Validate a schema and return detailed error information.

        Args:
            schema: The schema to validate

        Returns:
            ValidationResult with validity status and error details
        """
        errors = []

        # Basic JSON Schema validation - check required properties
        if not isinstance(schema, dict):
            return ValidationResult(False, [ValidationError("Schema must be an object")])

        # Check for the properties this validator requires
        if "$schema" not in schema:
            errors.append(ValidationError("Missing required $schema property"))

        if "type" not in schema:
            errors.append(ValidationError("Missing required type property"))

        # Validate MarkiTect extensions
        errors.extend(self._validate_markitect_extensions(schema))

        return ValidationResult(len(errors) == 0, errors)

    def _validate_markitect_extensions(self, schema: Dict[str, Any]) -> List[ValidationError]:
        """Validate MarkiTect-specific extensions in the schema."""
        errors = []

        # Define validation rules for MarkiTect extensions
        validation_rules = {
            "x-markitect-outline-depth": self._validate_outline_depth,
            "x-markitect-outline-mode": self._validate_outline_mode,
            "x-markitect-heading-text": self._validate_heading_text,
            "x-markitect-content-instructions": self._validate_content_instructions,
            "x-markitect-instruction-type": self._validate_instruction_type,
            "x-markitect-generation-mode": self._validate_generation_mode,
            "x-markitect-generated-from": self._validate_generated_from,
        }

        # Apply validation rules
        for property_name, validator in validation_rules.items():
            if property_name in schema:
                error = validator(schema[property_name], property_name)
                if error:
                    errors.append(error)

        # Recursively validate nested properties
        if "properties" in schema:
            for prop_name, prop_schema in schema["properties"].items():
                if isinstance(prop_schema, dict):
                    nested_errors = self._validate_markitect_extensions(prop_schema)
                    errors.extend(nested_errors)

        return errors

    def _validate_outline_depth(self, value: Any, property_name: str) -> Optional[ValidationError]:
        """Validate x-markitect-outline-depth property."""
        # bool is a subclass of int in Python, so reject True/False explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            return ValidationError(
                "x-markitect-outline-depth must be an integer >= 1",
                property_name
            )
        return None

    def _validate_outline_mode(self, value: Any, property_name: str) -> Optional[ValidationError]:
        """Validate x-markitect-outline-mode property."""
        if not isinstance(value, bool):
            return ValidationError(
                "x-markitect-outline-mode must be a boolean",
                property_name
            )
        return None

    def _validate_heading_text(self, value: Any, property_name: str) -> Optional[ValidationError]:
        """Validate x-markitect-heading-text property."""
        if not isinstance(value, str):
            return ValidationError(
                "x-markitect-heading-text must be a string",
                property_name
            )
        return None

    def _validate_content_instructions(self, value: Any, property_name: str) -> Optional[ValidationError]:
        """Validate x-markitect-content-instructions property."""
        if not isinstance(value, str):
            return ValidationError(
                "x-markitect-content-instructions must be a string",
                property_name
            )
        return None

    def _validate_instruction_type(self, value: Any, property_name: str) -> Optional[ValidationError]:
        """Validate x-markitect-instruction-type property."""
        valid_types = ["description", "example", "constraint", "template"]
        if not isinstance(value, str) or value not in valid_types:
            return ValidationError(
                f"x-markitect-instruction-type must be one of {valid_types}",
                property_name
            )
        return None

    def _validate_generation_mode(self, value: Any, property_name: str) -> Optional[ValidationError]:
        """Validate x-markitect-generation-mode property."""
        valid_modes = ["outline", "full"]
        if not isinstance(value, str) or value not in valid_modes:
            return ValidationError(
                f"x-markitect-generation-mode must be one of {valid_modes}",
                property_name
            )
        return None

    def _validate_generated_from(self, value: Any, property_name: str) -> Optional[ValidationError]:
        """Validate x-markitect-generated-from property."""
        if not isinstance(value, str):
            return ValidationError(
                "x-markitect-generated-from must be a string",
                property_name
            )
        return None
|
||||
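
The validator above follows a table-driven pattern: a mapping from extension key to check function, applied to the schema and then recursively to each entry under `properties`. The standalone sketch below reproduces that pattern for two of the keys so it can be run without the `markitect` package. It is not the module's API: `collect_errors`, `RULES`, and the check functions are hypothetical names, and unlike the original it records the full path to each offending key. It also rejects booleans for the depth, since `bool` is a subclass of `int` in Python.

```python
from typing import Any, Callable, Dict, List, Optional


def check_outline_depth(value: Any) -> Optional[str]:
    # isinstance(True, int) is True in Python, so booleans are excluded first.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        return "must be an integer >= 1"
    return None


def check_heading_text(value: Any) -> Optional[str]:
    if not isinstance(value, str):
        return "must be a string"
    return None


# Extension key -> check function returning an error message or None.
RULES: Dict[str, Callable[[Any], Optional[str]]] = {
    "x-markitect-outline-depth": check_outline_depth,
    "x-markitect-heading-text": check_heading_text,
}


def collect_errors(schema: Dict[str, Any], path: str = "") -> List[str]:
    """Apply RULES to this schema level, then recurse into `properties`."""
    errors: List[str] = []
    for key, rule in RULES.items():
        if key in schema:
            message = rule(schema[key])
            if message:
                errors.append(f"{path}/{key}: {message}")

    properties = schema.get("properties")
    if isinstance(properties, dict):
        for name, sub_schema in properties.items():
            if isinstance(sub_schema, dict):
                errors.extend(collect_errors(sub_schema, f"{path}/properties/{name}"))
    return errors
```

For example, `collect_errors({"x-markitect-outline-depth": True})` reports one error, whereas a plain `isinstance(value, int)` check would accept `True` as a depth of 1.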
@@ -28,13 +28,21 @@ class SchemaGenerator:
|
||||
"""Initialize the schema generator."""
|
||||
self.default_schema_url = "http://json-schema.org/draft-07/schema#"
|
||||
|
||||
def generate_schema_from_file(self, file_path: Path, max_depth: Optional[int] = None) -> Dict[str, Any]:
|
||||
def generate_schema_from_file(
|
||||
self,
|
||||
file_path: Path,
|
||||
max_depth: Optional[int] = None,
|
||||
mode: Optional[str] = None,
|
||||
outline_depth: Optional[int] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Generate a JSON schema from a markdown file's AST structure.
|
||||
|
||||
Args:
|
||||
file_path: Path to the markdown file
|
||||
max_depth: Maximum heading depth to include (None = unlimited)
|
||||
mode: Generation mode ('outline' for structure-focused schemas)
|
||||
outline_depth: Depth limit for outline mode
|
||||
|
||||
Returns:
|
||||
JSON schema as a dictionary
|
||||
@@ -58,7 +66,7 @@ class SchemaGenerator:
|
||||
structure_analysis = self._analyze_ast_structure(ast_tokens, max_depth)
|
||||
|
||||
# Generate the JSON schema
|
||||
schema = self._create_json_schema(structure_analysis, file_path.name)
|
||||
schema = self._create_json_schema(structure_analysis, file_path.name, mode=mode, outline_depth=outline_depth)
|
||||
|
||||
return schema
|
||||
|
||||
@@ -170,25 +178,42 @@ class SchemaGenerator:
|
||||
|
||||
return analysis
|
||||
|
||||
def _create_json_schema(self, analysis: Dict[str, Any], filename: str) -> Dict[str, Any]:
|
||||
def _create_json_schema(
|
||||
self,
|
||||
analysis: Dict[str, Any],
|
||||
filename: str,
|
||||
mode: Optional[str] = None,
|
||||
outline_depth: Optional[int] = None
|
||||
) -> Dict[str, Any]:
|
||||
"""
|
||||
Create a JSON schema from structural analysis.
|
||||
|
||||
Args:
|
||||
analysis: Structural analysis of the document
|
||||
filename: Name of the source file
|
||||
mode: Generation mode ('outline' for structure-focused schemas)
|
||||
outline_depth: Depth limit for outline mode
|
||||
|
||||
Returns:
|
||||
JSON schema dictionary
|
||||
"""
|
||||
# Determine title format based on mode
|
||||
title_preposition = "from" if mode == "outline" else "for"
|
||||
|
||||
schema = {
|
||||
"$schema": self.default_schema_url,
|
||||
"type": "object",
|
||||
"title": f"Schema for {filename}",
|
||||
"title": f"Schema {title_preposition} {filename}",
|
||||
"description": f"JSON schema describing the structure of {filename}",
|
||||
"properties": {}
|
||||
}
|
||||
|
||||
# Add metaschema extensions for outline mode
|
||||
if mode == "outline":
|
||||
schema["x-markitect-outline-mode"] = True
|
||||
if outline_depth is not None:
|
||||
schema["x-markitect-outline-depth"] = outline_depth
|
||||
|
||||
# Add heading structure
|
||||
if analysis['headings']:
|
||||
heading_properties = {}
|
||||
|
||||
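
The outline-mode changes to `_create_json_schema` affect only the schema header: the title preposition and two `x-markitect-*` keys. A minimal standalone sketch of that header logic, assuming the hypothetical function name `build_schema_header` and leaving out the heading analysis that follows in the real method:

```python
from typing import Any, Dict, Optional

DRAFT_07_URL = "http://json-schema.org/draft-07/schema#"


def build_schema_header(
    filename: str,
    mode: Optional[str] = None,
    outline_depth: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the top-level schema fields the way the diff above does."""
    # Outline mode titles the schema "from" the file, default mode "for" it.
    preposition = "from" if mode == "outline" else "for"
    schema: Dict[str, Any] = {
        "$schema": DRAFT_07_URL,
        "type": "object",
        "title": f"Schema {preposition} {filename}",
        "description": f"JSON schema describing the structure of {filename}",
        "properties": {},
    }
    if mode == "outline":
        schema["x-markitect-outline-mode"] = True
        # The depth key is only written when a depth was actually given.
        if outline_depth is not None:
            schema["x-markitect-outline-depth"] = outline_depth
    return schema
```

Because the extension keys are added only in outline mode, schemas generated in the default mode keep their previous shape, which is what backward compatibility requires here.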
52
markitect/schemas/markitect-metaschema.json
Normal file
@@ -0,0 +1,52 @@
|
||||
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://markitect.io/schemas/markitect-metaschema.json",
  "type": "object",
  "title": "MarkiTect Extended JSON Schema Metaschema",
  "description": "Metaschema for MarkiTect JSON schemas that extends standard JSON Schema with MarkiTect-specific features for document structure analysis and generation",
  "allOf": [
    {
      "$ref": "http://json-schema.org/draft-07/schema#"
    },
    {
      "properties": {
        "x-markitect-heading-text": {
          "type": "string",
          "description": "Preserve actual heading text from source document for validation and template generation"
        },
        "x-markitect-content-instructions": {
          "type": "string",
          "description": "Instructions for content authors about what should go in this section"
        },
        "x-markitect-outline-mode": {
          "type": "boolean",
          "description": "Indicates if this schema was generated in outline mode, focusing on structural hierarchy"
        },
        "x-markitect-outline-depth": {
          "type": "integer",
          "minimum": 1,
          "description": "Maximum heading depth captured in outline mode"
        },
        "x-markitect-instruction-type": {
          "type": "string",
          "enum": ["description", "example", "constraint", "template"],
          "description": "Type of content instruction provided"
        },
        "x-markitect-generated-from": {
          "type": "string",
          "description": "Source file or document this schema was generated from"
        },
        "x-markitect-generation-mode": {
          "type": "string",
          "enum": ["outline", "full"],
          "description": "Mode used to generate this schema"
        }
      },
      "patternProperties": {
        "^x-markitect-": {
          "description": "MarkiTect extension properties"
        }
      }
    }
  ]
}
|
||||
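To illustrate what the extension keywords constrain, here is a small stdlib-only sketch that checks the top-level `x-markitect-*` keys of a generated schema against the types, enum values, and minimum declared in the metaschema above. `EXTENSION_RULES` and `check_extensions` are hypothetical names written for this example, and the check is deliberately simplified; a real project would more likely run a full JSON Schema validator against the metaschema file.

```python
# Rules transcribed by hand from the metaschema's "properties" block.
EXTENSION_RULES = {
    "x-markitect-heading-text": {"type": str},
    "x-markitect-content-instructions": {"type": str},
    "x-markitect-outline-mode": {"type": bool},
    "x-markitect-outline-depth": {"type": int, "minimum": 1},
    "x-markitect-instruction-type": {
        "type": str,
        "enum": ["description", "example", "constraint", "template"],
    },
    "x-markitect-generated-from": {"type": str},
    "x-markitect-generation-mode": {"type": str, "enum": ["outline", "full"]},
}


def check_extensions(schema: dict) -> list:
    """Return error strings for known x-markitect-* keys with invalid values."""
    errors = []
    for key, value in schema.items():
        rule = EXTENSION_RULES.get(key)
        if rule is None:
            # Unknown keys, including other x-markitect-* ones, are allowed.
            continue
        expected = rule["type"]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        if expected is int and isinstance(value, bool):
            errors.append(f"{key}: expected integer, got boolean")
            continue
        if not isinstance(value, expected):
            errors.append(f"{key}: expected {expected.__name__}, got {type(value).__name__}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{key}: {value!r} is not one of {rule['enum']}")
        if "minimum" in rule and value < rule["minimum"]:
            errors.append(f"{key}: {value} is below the minimum {rule['minimum']}")
    return errors


generated = {
    "title": "Schema from example.md",
    "x-markitect-outline-mode": True,
    "x-markitect-outline-depth": 0,
}
print(check_extensions(generated))
```

The example flags the depth of 0, the same value that `test_depth_parameter_validation` expects the CLI to reject.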
366
tests/test_issue_51_outline_mode.py
Normal file
@@ -0,0 +1,366 @@
"""
Tests for Issue #51: Add outline mode to schema generation

This test module implements comprehensive tests for the new outline mode functionality
that captures document structure with actual heading text and depth control.

Following TDD8 methodology - these tests are written before implementation.
"""

import json
import pytest
from pathlib import Path
from tempfile import NamedTemporaryFile
from click.testing import CliRunner

from markitect.cli import cli
from markitect.schema_generator import SchemaGenerator
from markitect.exceptions import InvalidDepthError


class TestIssue51OutlineMode:
    """Test suite for outline mode schema generation functionality."""

    def setup_method(self):
        """Set up test fixtures."""
        self.schema_generator = SchemaGenerator()
        self.runner = CliRunner()

    def test_cli_accepts_mode_outline_option(self):
        """Test that CLI accepts --mode outline option."""
        # Arrange
        markdown_content = """# Test Document

## Introduction
This is a test document.

### Details
Some details here.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0, f"CLI should accept --mode outline option, got: {result.output}"

        finally:
            temp_file.unlink()

    def test_cli_accepts_depth_parameter(self):
        """Test that CLI accepts --depth parameter with outline mode."""
        # Arrange
        markdown_content = """# Test Document

## Introduction
This is a test document.

### Details
Some details here.

#### Specifics
Very specific information.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                '--depth', '2',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0, f"CLI should accept --depth parameter, got: {result.output}"

        finally:
            temp_file.unlink()

    def test_outline_mode_generates_schema_with_from_title(self):
        """Test that outline mode generates schema with 'from' in title instead of 'for'."""
        # Arrange
        markdown_content = """# Test Document

## Introduction
This is a test document.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0
            schema = json.loads(result.output)
            expected_title = f"Schema from {temp_file.name}"
            assert schema["title"] == expected_title, f"Expected title 'Schema from {temp_file.name}', got '{schema.get('title')}'"

        finally:
            temp_file.unlink()

    def test_outline_mode_captures_actual_heading_text(self):
        """Test that outline mode captures actual heading text in schema."""
        # Arrange
        markdown_content = """# Main Architecture Document

## System Overview
High-level system description.

### Core Components
Details about main components.

## Implementation Strategy
Strategy for implementation.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0
            schema = json.loads(result.output)

            # Check that headings properties exist and contain actual text
            assert "headings" in schema["properties"], "Schema should contain headings property"

            # Should have level_1, level_2, level_3 based on content
            headings = schema["properties"]["headings"]["properties"]
            assert "level_1" in headings, "Should have level_1 headings"
            assert "level_2" in headings, "Should have level_2 headings"
            assert "level_3" in headings, "Should have level_3 headings"

            # Check heading text is captured (this will need to be implemented)
            # For now, verify structure exists
            level_1_schema = headings["level_1"]
            assert level_1_schema["type"] == "array"
            assert "items" in level_1_schema

        finally:
            temp_file.unlink()

    def test_outline_mode_with_depth_limit_respects_depth(self):
        """Test that outline mode with --depth parameter respects depth limit."""
        # Arrange
        markdown_content = """# Main Document

## Section A
Content A.

### Subsection A1
Content A1.

#### Deep Section A1.1
Very deep content.

## Section B
Content B.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                '--depth', '2',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0
            schema = json.loads(result.output)

            headings = schema["properties"]["headings"]["properties"]
            assert "level_1" in headings, "Should have level_1 headings"
            assert "level_2" in headings, "Should have level_2 headings"
            assert "level_3" not in headings, "Should not have level_3 headings with depth=2"
            assert "level_4" not in headings, "Should not have level_4 headings with depth=2"

        finally:
            temp_file.unlink()

    def test_outline_mode_integrates_with_metaschema_extensions(self):
        """Test that outline mode integrates with metaschema extensions from Issue #50."""
        # Arrange
        markdown_content = """# Test Document

## Introduction
This is a test document.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                '--depth', '3',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0
            schema = json.loads(result.output)

            # Check for metaschema extensions
            assert "x-markitect-outline-mode" in schema, "Should have outline mode marker"
            assert schema["x-markitect-outline-mode"] is True, "Outline mode should be marked as true"

            assert "x-markitect-outline-depth" in schema, "Should have outline depth marker"
            assert schema["x-markitect-outline-depth"] == 3, "Should record the depth setting"

        finally:
            temp_file.unlink()

    def test_outline_mode_works_with_outfile_parameter(self):
        """Test that outline mode works with existing --outfile parameter."""
        # Arrange
        markdown_content = """# Test Document

## Introduction
This is a test document.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        with NamedTemporaryFile(mode='w', suffix='.json', delete=False) as outf:
            output_file = Path(outf.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                '--outfile', str(output_file),
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0
            assert output_file.exists(), "Output file should be created"

            schema_content = output_file.read_text()
            schema = json.loads(schema_content)

            expected_title = f"Schema from {temp_file.name}"
            assert schema["title"] == expected_title

        finally:
            temp_file.unlink()
            if output_file.exists():
                output_file.unlink()

    def test_cli_maintains_backward_compatibility_with_max_depth(self):
        """Test that existing --max-depth option still works with default mode."""
        # Arrange
        markdown_content = """# Test Document

## Introduction
This is a test document.

### Details
Some details here.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--max-depth', '2',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code == 0, f"CLI should maintain backward compatibility with --max-depth, got: {result.output}"
            schema = json.loads(result.output)

            # Should use old title format for backward compatibility
            expected_title = f"Schema for {temp_file.name}"
            assert schema["title"] == expected_title, f"Default mode should use 'for' in title"

        finally:
            temp_file.unlink()

    def test_depth_parameter_validation(self):
        """Test that --depth parameter validates input correctly."""
        # Arrange
        markdown_content = """# Test Document

## Introduction
This is a test document.
"""

        with NamedTemporaryFile(mode='w', suffix='.md', delete=False) as f:
            f.write(markdown_content)
            temp_file = Path(f.name)

        try:
            # Act - Test invalid depth
            result = self.runner.invoke(cli, [
                'schema-generate',
                '--mode', 'outline',
                '--depth', '0',
                str(temp_file)
            ])

            # Assert
            assert result.exit_code != 0, "Should reject depth=0"
            assert "Invalid depth parameter" in result.output or "depth must be >= 1" in result.output

        finally:
            temp_file.unlink()

    def test_cli_help_includes_new_options(self):
        """Test that CLI help text includes documentation for new options."""
        # Act
        result = self.runner.invoke(cli, ['schema-generate', '--help'])

        # Assert
        assert result.exit_code == 0
        help_text = result.output
        assert "--mode" in help_text, "Help should document --mode option"
        assert "--depth" in help_text, "Help should document --depth option"
        assert "outline" in help_text, "Help should mention outline mode"
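Nearly every test above repeats the same `NamedTemporaryFile(..., delete=False)` setup and `finally: temp_file.unlink()` cleanup. One possible way to factor that out, sketched here with the standard library only, is a small context manager. `temp_markdown` is a hypothetical helper name that does not appear in the diff; pytest's built-in `tmp_path` fixture would be another option.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def temp_markdown(content: str) -> Iterator[Path]:
    """Write content to a temporary .md file, yield its path, then delete it."""
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".md", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(content)
        path = Path(handle.name)
    # The file is closed here, so another process (such as the CLI) can open it.
    try:
        yield path
    finally:
        path.unlink()


with temp_markdown("# Test Document\n\n## Introduction\nThis is a test document.\n") as md_path:
    first_line = md_path.read_text(encoding="utf-8").splitlines()[0]
    print(md_path.suffix, first_line)
```

Inside a test, the body of the `with` block would hold the `self.runner.invoke(...)` call and the assertions, and the file is removed even when an assertion fails.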