feat: add LinkValidator for semantic link validation (Phase 3)
Implement comprehensive link validation as part of semantic validation: Core Features: - Link classification: internal, external, fragment, email - Internal link validation: fragment anchors and file paths - External link validation: HTTP/HTTPS with configurable timeout - Email validation: mailto: link format checking - Fragment policy enforcement: allow/disallow fragment identifiers Link Validator: - markitect/validators/link_validator.py - Full link validation implementation - Supports x-markitect-content-control.link_validation configuration - Default: check internal links, skip external (fast) - Opt-in external checking with --check-links flag Integration: - Updated SemanticValidator to include link_result in reports - CLI already supports --check-links flag (line 1629 in cli.py) - Link validation runs by default for internal links (fast) - External link checking requires explicit --check-links flag Test Coverage: - Added 9 comprehensive tests for LinkValidator - Tests cover: classification, broken links, fragments, email, statistics - All 25 semantic validator tests passing (100%) Documentation: - Updated SCHEMA_MANAGEMENT_GUIDE.md with link validation section - Added examples for broken links and external link checking - Documented link types, validation rules, and configuration Statistics Tracking: - Links checked, internal/external/fragment/email counts - Detailed error/warning reporting with line numbers - Integration with existing semantic validation reporting 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
@@ -177,6 +177,9 @@ markitect validate my-document.md --schema manpage-schema-v1.0.md
|
||||
# Only structural validation (classic mode)
|
||||
markitect validate my-document.md --schema schema.json --no-semantic
|
||||
|
||||
# With external link checking (may be slow)
|
||||
markitect validate my-document.md --schema manpage-schema-v1.0.md --check-links
|
||||
|
||||
# Strict mode (warnings become errors)
|
||||
markitect validate my-document.md --schema manpage-schema-v1.0.md --strict
|
||||
```
|
||||
@@ -202,6 +205,14 @@ markitect validate my-document.md --schema manpage-schema-v1.0.md --strict
|
||||
- **Quality Metrics**: Checks word counts, sentence counts
|
||||
- `min_words`, `max_words`: Word count requirements (WARNING)
|
||||
- `min_sentences`: Minimum sentence count (WARNING)
|
||||
- **Link Validation**: Validates internal and external links (optional)
|
||||
- Internal links: Checked by default when semantic validation enabled
|
||||
- Fragment links (#section-name) verified to exist (ERROR if broken)
|
||||
- Relative file paths checked for existence (ERROR if broken)
|
||||
- External links: Opt-in with --check-links flag (may be slow)
|
||||
- HTTP/HTTPS URLs validated with HEAD requests (WARNING if broken)
|
||||
- Email validation: Validates mailto: link format (WARNING if invalid)
|
||||
- Fragment policy: Configurable allow/disallow fragment identifiers
|
||||
|
||||
### Validation Output
|
||||
|
||||
@@ -222,6 +233,9 @@ Section Validation:
|
||||
Content Validation:
|
||||
✅ All content requirements met
|
||||
|
||||
Link Validation:
|
||||
✅ All 12 links valid
|
||||
|
||||
Summary:
|
||||
Sections checked: 3
|
||||
Sections found: 5
|
||||
@@ -271,6 +285,30 @@ $ markitect validate doc.md --schema manpage-schema-v1.0.md --strict
|
||||
Status: FAILED ❌ (warnings treated as errors)
|
||||
```
|
||||
|
||||
**Example 4: Broken Internal Link**
|
||||
```bash
|
||||
$ markitect validate doc.md --schema manpage-schema-v1.0.md
|
||||
|
||||
Link Validation:
|
||||
❌ #nonexistent-section - Internal link target not found: #nonexistent-section
|
||||
|
||||
Errors: 1
|
||||
Status: FAILED ❌
|
||||
```
|
||||
|
||||
**Example 5: External Link Validation**
|
||||
```bash
|
||||
# Enable external link checking (may be slow)
|
||||
$ markitect validate doc.md --schema manpage-schema-v1.0.md --check-links
|
||||
|
||||
Link Validation:
|
||||
✅ http://example.com - Valid
|
||||
⚠️ http://broken-link.invalid - External link unreachable: Name or service not known
|
||||
|
||||
Warnings: 1
|
||||
Status: PASSED ✅
|
||||
```
|
||||
|
||||
## Schema Naming Conventions
|
||||
|
||||
All schema filenames must follow this pattern:
|
||||
|
||||
Reference in New Issue
Block a user