feat: add LinkValidator for semantic link validation (Phase 3)

Implement comprehensive link validation as part of semantic validation:

Core Features:
- Link classification: internal, external, fragment, email
- Internal link validation: fragment anchors and file paths
- External link validation: HTTP/HTTPS with configurable timeout
- Email validation: mailto: link format checking
- Fragment policy enforcement: allow/disallow fragment identifiers
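
For illustration, the classification step listed above could be sketched as follows. This is a minimal sketch and not the code in link_validator.py: `classify_link` and the four string labels are hypothetical names, and treating every non-HTTP(S) target as internal is an assumption.

```python
from urllib.parse import urlsplit


def classify_link(target: str) -> str:
    """Return 'email', 'fragment', 'external', or 'internal' (hypothetical labels)."""
    target = target.strip()
    if target.lower().startswith("mailto:"):
        return "email"
    if target.startswith("#"):
        # Pure in-page anchor such as "#installation".
        return "fragment"
    if urlsplit(target).scheme in ("http", "https"):
        # urlsplit lowercases the scheme, so "HTTP://..." also matches.
        return "external"
    # Assumption: anything else (file paths, optionally with a trailing
    # #anchor, and other schemes) falls through to "internal".
    return "internal"
```

A path with an anchor, such as `docs/guide.md#setup`, is classified as internal here, so both the file and the anchor would be checked locally.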

Link Validator:
- markitect/validators/link_validator.py - Full link validation implementation
- Supports x-markitect-content-control.link_validation configuration
- Default: check internal links, skip external (fast)
- Opt-in external checking with --check-links flag
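
A rough sketch of what an opt-in external check with a configurable timeout might look like, using only the standard library. `probe_url`, the `opener` parameter, and the 5-second default are assumptions for illustration; the actual validator may use a different HTTP client or request method.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple


def probe_url(url: str, timeout: float = 5.0,
              opener: Callable = urllib.request.urlopen) -> Tuple[bool, str]:
    """Send a HEAD request and report (reachable, short message)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError by urlopen.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, connection failures, and socket timeouts.
        return False, f"unreachable: {exc}"
    return status < 400, f"HTTP {status}"
```

The `opener` parameter exists only so the function can be exercised without network access; skipping the call entirely is what keeps the default run fast.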

Integration:
- Updated SemanticValidator to include link_result in reports
- CLI already supports --check-links flag (line 1629 in cli.py)
- Link validation runs by default for internal links (fast)
- External link checking requires explicit --check-links flag

Test Coverage:
- Added 9 comprehensive tests for LinkValidator
- Tests cover: classification, broken links, fragments, email, statistics
- All 25 semantic validator tests passing (100%)

Documentation:
- Updated SCHEMA_MANAGEMENT_GUIDE.md with link validation section
- Added examples for broken links and external link checking
- Documented link types, validation rules, and configuration

Statistics Tracking:
- Links checked, internal/external/fragment/email counts
- Detailed error/warning reporting with line numbers
- Integration with existing semantic validation reporting
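
The per-type counts described above could be tracked along these lines. This is a sketch under assumptions: `LinkStats`, `by_kind`, and `record` are hypothetical names, and only `links_checked` mirrors an attribute that appears in the diff below.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LinkStats:
    """Tally of checked links per kind (hypothetical helper)."""
    by_kind: Counter = field(default_factory=Counter)

    def record(self, kind: str) -> None:
        # kind is expected to be one of: internal, external, fragment, email.
        self.by_kind[kind] += 1

    @property
    def links_checked(self) -> int:
        # Total across all kinds.
        return sum(self.by_kind.values())
```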

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-06 03:41:03 +01:00
parent 0d78837a53
commit 20c0cfece7
5 changed files with 829 additions and 10 deletions

@@ -22,6 +22,10 @@ from markitect.validators.content_validator import (
     ContentValidator,
     ContentValidationResult
 )
+from markitect.validators.link_validator import (
+    LinkValidator,
+    LinkValidationResult
+)
 @dataclass
@@ -33,7 +37,7 @@ class SemanticValidationReport:
     """
     section_result: SectionValidationResult
     content_result: Optional[ContentValidationResult] = None
-    link_result: Optional[Any] = None  # LinkValidationResult when implemented
+    link_result: Optional[LinkValidationResult] = None
     def has_errors(self) -> bool:
         """Check if there are any ERROR-level issues."""
@@ -99,6 +103,17 @@ class SemanticValidationReport:
             else:
                 lines.append(" ✅ All content requirements met")
+        # Link validation
+        if self.link_result:
+            lines.append("")
+            lines.append("Link Validation:")
+            if self.link_result.issues:
+                for issue in self.link_result.issues:
+                    status = "❌" if issue.severity == 'ERROR' else "⚠️"
+                    lines.append(f" {status} {issue.link} - {issue.message}")
+            else:
+                lines.append(f" ✅ All {self.link_result.links_checked} links valid")
         # Summary
         lines.append("")
         lines.append("Summary:")
@@ -112,6 +127,10 @@ class SemanticValidationReport:
             all_errors.extend(self.content_result.get_errors())
             all_warnings.extend(self.content_result.get_warnings())
+        if self.link_result:
+            all_errors.extend(self.link_result.get_errors())
+            all_warnings.extend(self.link_result.get_warnings())
         lines.append(f" Errors: {len(all_errors)}")
         lines.append(f" Warnings: {len(all_warnings)}")
@@ -155,9 +174,7 @@ class SemanticValidator:
         # Initialize sub-validators
         self.section_validator = SectionValidator(schema)
         self.content_validator = ContentValidator(schema)
-        # TODO: Initialize link validator when implemented
-        # self.link_validator = LinkValidator(schema)
+        self.link_validator = LinkValidator(schema)
     def validate(self, document_path: str | Path,
                  check_links: bool = False) -> SemanticValidationReport:
@@ -189,12 +206,12 @@ class SemanticValidator:
         # Run content validation
         content_result = self.content_validator.check(document)
-        # TODO: Run link validation when implemented
-        # if check_links:
-        #     link_result = self.link_validator.check(document)
-        # else:
-        #     link_result = None
-        link_result = None
+        # Run link validation (if enabled)
+        if check_links:
+            link_result = self.link_validator.check(document, check_external=True)
+        else:
+            # Still check internal links by default (fast)
+            link_result = self.link_validator.check(document, check_external=False)
         return SemanticValidationReport(
             section_result=section_result,