feat(schema): add semantic schema generation as default mode

schema-generate now builds content-aware schemas from the document's
section hierarchy instead of counting markdown syntax elements. Detects
key-value tables, data tables, link lists, and mixed content patterns
to produce schemas that reflect the actual document outline.
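
As a rough illustration of the outline-driven idea described above (not the code in this commit), the sketch below nests one schema node per heading so the schema mirrors the section hierarchy. The function name `outline_schema`, the node layout, and the `max_depth` handling are assumptions made for the example. Only the "Schema from <name>" title format comes from the updated test. Table and link-list detection are left out.

```python
import re

# ATX-style markdown heading: 1-6 '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def outline_schema(markdown: str, name: str, max_depth: int = 6) -> dict:
    """Hypothetical helper: build a nested dict that mirrors the heading outline."""
    root = {"title": f"Schema from {name}", "type": "object", "properties": {}}
    # Stack of (heading level, node); the root acts as level 0.
    stack = [(0, root)]
    in_fence = False
    for line in markdown.splitlines():
        # Skip fenced code blocks so '#' comments are not read as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if not match:
            continue
        level = len(match.group(1))
        if level > max_depth:
            continue
        node = {"type": "object", "properties": {}}
        # Pop back to the nearest shallower heading, which becomes the parent.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["properties"][match.group(2)] = node
        stack.append((level, node))
    return root
```

Under these assumptions, a document with one top-level heading and two second-level headings yields a single top-level property containing two nested properties, and passing `max_depth=1` drops the nested ones.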

Old behavior preserved via --mode syntactic. Validator and visualization
tools pinned to syntactic mode for compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:49:50 +01:00
parent 120ed89780
commit 60f33443ae
8 changed files with 408 additions and 55 deletions

@@ -290,7 +290,7 @@ This is a test document.
output_file.unlink()
def test_cli_maintains_backward_compatibility_with_max_depth(self):
-"""Test that existing --max-depth option still works with default mode."""
+"""Test that existing --max-depth option still works with default (semantic) mode."""
# Arrange
markdown_content = """# Test Document
@@ -317,9 +317,9 @@ Some details here.
assert result.exit_code == 0, f"CLI should maintain backward compatibility with --max-depth, got: {result.output}"
schema = json.loads(result.output)
-# Should use old title format for backward compatibility
-expected_title = f"Schema for {temp_file.name}"
-assert schema["title"] == expected_title, f"Default mode should use 'for' in title"
+# Default mode is now semantic, which uses 'from' in title
+expected_title = f"Schema from {temp_file.name}"
+assert schema["title"] == expected_title, f"Default (semantic) mode should use 'from' in title"
finally:
temp_file.unlink()