Added comprehensive full text search capabilities as a lightweight plugin.

Key features:
- SQLite FTS5-based search engine with no external dependencies
- Automatic indexing via database triggers for real-time updates
- Advanced query support: phrase search, boolean operators, proximity search
- Complete CLI interface with search commands
- Graceful fallback to LIKE queries when FTS5 unavailable
- Plugin architecture integration for extensibility

CLI Commands:
- `markitect search init` - Initialize search indexes
- `markitect search query` - Perform full text searches
- `markitect search status` - View index statistics
- `markitect search rebuild` - Rebuild indexes from scratch

Search Features:
- Content type filtering (files, schemas, all)
- Result pagination and formatting options
- Query validation and syntax assistance
- Performance optimization and index maintenance

Technical Implementation:
- FTSSearchPlugin: Main search plugin class
- SearchIndexer: FTS5 table management and indexing
- QueryParser: Query optimization and FTS5 syntax conversion
- Comprehensive error handling and fallback mechanisms
- 25 test cases covering all functionality

Documentation includes complete usage guide and examples.

Resolves issue #83: Full text search

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Full Text Search - Issue #83
MarkiTect provides powerful full text search capabilities using SQLite's FTS5 extension, implemented as a lightweight plugin system.
Features
- SQLite FTS5: Leverages SQLite's built-in FTS5 virtual tables for high-performance search
- No Dependencies: Uses only SQLite, no additional search libraries required
- Real-time Indexing: Automatic index updates when content changes
- Advanced Queries: Support for phrase search, boolean operators, and proximity search
- CLI Integration: Complete command-line interface for search operations
- Fallback Support: Graceful degradation to simple LIKE queries if FTS5 is unavailable
Quick Start
Initialize Search
First, initialize the search indexes:
markitect search init
This creates FTS5 virtual tables and sets up automatic indexing triggers.
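For readers curious what such an initialization might look like at the SQLite level, here is a minimal sketch using Python's built-in sqlite3 module. The files table, its columns, and the trigger names are assumptions made for illustration; MarkiTect's actual schema may differ. It follows the external-content trigger pattern described in the SQLite FTS5 documentation.

```python
import sqlite3

# Hypothetical schema: the real MarkiTect tables and columns may differ.
SCHEMA = """
CREATE TABLE files (
    id INTEGER PRIMARY KEY,
    filename TEXT NOT NULL,
    content TEXT NOT NULL
);

-- External-content FTS5 table: the index refers back to rows in files.
CREATE VIRTUAL TABLE fts_files USING fts5(
    filename, content,
    content='files', content_rowid='id'
);

-- Triggers keep the index in step with every insert, delete, and update.
CREATE TRIGGER files_ai AFTER INSERT ON files BEGIN
    INSERT INTO fts_files(rowid, filename, content)
    VALUES (new.id, new.filename, new.content);
END;
CREATE TRIGGER files_ad AFTER DELETE ON files BEGIN
    INSERT INTO fts_files(fts_files, rowid, filename, content)
    VALUES ('delete', old.id, old.filename, old.content);
END;
CREATE TRIGGER files_au AFTER UPDATE ON files BEGIN
    INSERT INTO fts_files(fts_files, rowid, filename, content)
    VALUES ('delete', old.id, old.filename, old.content);
    INSERT INTO fts_files(rowid, filename, content)
    VALUES (new.id, new.filename, new.content);
END;
"""


def init_search(conn):
    """Create the content table, the FTS5 index, and the sync triggers."""
    conn.executescript(SCHEMA)


def match_ids(conn, query):
    """Return the rowids of indexed files matching an FTS5 query."""
    rows = conn.execute(
        "SELECT rowid FROM fts_files WHERE fts_files MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
init_search(conn)
conn.execute(
    "INSERT INTO files (filename, content) VALUES (?, ?)",
    ("README.md", "GraphQL API documentation"),
)
# The update trigger removes the old index entry and adds the new one.
conn.execute("UPDATE files SET content = ? WHERE id = 1", ("REST API documentation",))
```

Because the triggers run inside the same transaction as the content change, the index cannot drift from the table unless rows are modified with the triggers disabled.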
Rebuild Indexes
To rebuild indexes from scratch:
markitect search rebuild --optimize
Check Status
View search system status:
markitect search status
Perform Searches
Search across all content:
markitect search query "API documentation"
Search only files:
markitect search query "graphql" --type files --limit 5
Search only schemas:
markitect search query "user" --type schemas
Query Syntax
Simple Queries
# Single word - automatically adds wildcard
markitect search query "api" # Finds: api, apis, apiKey, etc.
# Multiple words - implicit AND
markitect search query "api documentation" # Finds documents with both terms
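The rewriting described above (a trailing wildcard for a single word, implicit AND for several words) could be approximated as follows. This is only a sketch: to_fts5_query is a hypothetical helper, not the actual QueryParser, and the rule that any query containing operators or FTS5 punctuation is passed through unchanged is an assumption.

```python
import re

# FTS5 boolean operators are case-sensitive and must be upper case.
OPERATORS = {"AND", "OR", "NOT"}


def to_fts5_query(user_query: str) -> str:
    """Turn a plain user query into FTS5 syntax (illustrative rules only)."""
    query = user_query.strip()
    has_syntax = any(ch in query for ch in '"():*')
    has_operator = any(token in OPERATORS for token in query.split())
    if has_syntax or has_operator:
        # Assume the user wrote FTS5 syntax on purpose and leave it alone.
        return query
    terms = re.findall(r"\w+", query)
    if len(terms) == 1:
        # Single word: prefix match, so "api" also finds "apis".
        return f"{terms[0]}*"
    # Several words: every term must be present.
    return " AND ".join(terms)


single = to_fts5_query("api")
multiple = to_fts5_query("api documentation")
```

Here single is the prefix query api* and multiple is api AND documentation; a phrase such as "GraphQL mutation" in double quotes is returned unchanged.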
Phrase Search
# Exact phrase matching
markitect search query '"GraphQL mutation"'
Boolean Operators
# AND operator
markitect search query "api AND documentation"
# OR operator
markitect search query "rest OR graphql"
# NOT operator
markitect search query "api NOT deprecated"
Advanced Features
# Proximity search (terms at most 10 tokens apart)
markitect search query "NEAR(api documentation, 10)"
# Column-specific search
markitect search query "filename:readme"
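To see how these query forms behave in FTS5 itself, the sketch below runs them against a small in-memory index with Python's sqlite3 module. The three sample documents and the search helper are invented for illustration, and the table layout (filename and content columns) is an assumption about how fts_files might be defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one FTS5 table with filename and content columns.
conn.execute("CREATE VIRTUAL TABLE fts_files USING fts5(filename, content)")
conn.executemany(
    "INSERT INTO fts_files (filename, content) VALUES (?, ?)",
    [
        ("readme.md", "API documentation for the REST endpoints"),
        ("graphql.md", "GraphQL mutation examples and API notes"),
        ("legacy.md", "Deprecated API kept for old clients"),
    ],
)


def search(query):
    """Return matching filenames in alphabetical order."""
    rows = conn.execute(
        "SELECT filename FROM fts_files WHERE fts_files MATCH ? ORDER BY filename",
        (query,),
    )
    return [row[0] for row in rows]


phrase = search('"GraphQL mutation"')            # exact phrase
either = search("rest OR graphql")               # boolean OR
excluded = search("api NOT deprecated")          # boolean NOT
nearby = search("NEAR(api documentation, 10)")   # proximity
by_column = search("filename:readme")            # column filter
```

The default tokenizer folds case, so api matches API, and a column filter such as filename:readme only looks at the named column.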
CLI Commands
markitect search init
Initialize search indexes and FTS5 tables.
Options:
--rebuild - Rebuild existing indexes during initialization
Examples:
markitect search init
markitect search init --rebuild
markitect search query
Perform full text search queries.
Arguments:
QUERY - Search query string
Options:
--type [all|files|schemas] - Content type to search (default: all)
--limit INTEGER - Maximum number of results (default: 20)
--offset INTEGER - Result offset for pagination (default: 0)
--format [table|json|yaml] - Output format (default: table)
--no-highlight - Disable result highlighting
Examples:
markitect search query "documentation"
markitect search query "api" --type files --limit 10
markitect search query "schema" --format json
markitect search query "user" --offset 20 --limit 10 # Pagination
markitect search status
Show search index status and statistics.
Options:
--format [table|json|yaml] - Output format (default: table)
Examples:
markitect search status
markitect search status --format json
markitect search rebuild
Rebuild search indexes from scratch.
Options:
--optimize - Optimize indexes after rebuild
Examples:
markitect search rebuild
markitect search rebuild --optimize
Architecture
Plugin System
The search functionality is implemented as a plugin within MarkiTect's plugin architecture:
- FTSSearchPlugin: Main search plugin class
- SearchIndexer: Handles FTS5 table creation and maintenance
- QueryParser: Parses and optimizes search queries
Database Integration
- FTS5 Virtual Tables: fts_files and fts_schemas for content indexing
- Automatic Triggers: Database triggers keep indexes synchronized
- Fallback Queries: LIKE-based search when FTS5 is unavailable
Search Process
- Indexing: Content automatically indexed via database triggers
- Query Parsing: User queries converted to FTS5-compatible syntax
- Search Execution: FTS5 performs ranked full text search
- Result Processing: Results formatted with highlighting and metadata
- Fallback: Simple LIKE queries if FTS5 fails
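The steps above can be sketched in a single function. This is an illustration under stated assumptions rather than the plugin's actual code: search_files is a hypothetical name, the files and fts_files layouts are invented, and the bracket markers stand in for whatever highlighting the CLI really uses. The FTS5 pieces (MATCH, ORDER BY rank, and highlight()) are standard SQLite features.

```python
import sqlite3


def search_files(conn, query, limit=20, offset=0):
    """Ranked FTS5 search with highlighting; falls back to LIKE on failure."""
    try:
        rows = conn.execute(
            """
            SELECT filename, highlight(fts_files, 1, '[', ']')
            FROM fts_files
            WHERE fts_files MATCH ?
            ORDER BY rank
            LIMIT ? OFFSET ?
            """,
            (query, limit, offset),
        ).fetchall()
        return "fts5", rows
    except sqlite3.OperationalError:
        # Raised when the FTS5 table is missing or the query is not valid FTS5.
        rows = conn.execute(
            "SELECT filename, content FROM files WHERE content LIKE ? "
            "ORDER BY filename LIMIT ? OFFSET ?",
            (f"%{query}%", limit, offset),
        ).fetchall()
        return "like", rows


docs = [("a.md", "GraphQL resolver notes"), ("b.md", "REST API guide")]

# Database with both the content table and an FTS5 index.
indexed = sqlite3.connect(":memory:")
indexed.execute("CREATE TABLE files (filename TEXT, content TEXT)")
indexed.execute("CREATE VIRTUAL TABLE fts_files USING fts5(filename, content)")
indexed.executemany("INSERT INTO files (filename, content) VALUES (?, ?)", docs)
indexed.executemany("INSERT INTO fts_files (filename, content) VALUES (?, ?)", docs)

# Database without an index, which forces the LIKE fallback.
plain = sqlite3.connect(":memory:")
plain.execute("CREATE TABLE files (filename TEXT, content TEXT)")
plain.executemany("INSERT INTO files (filename, content) VALUES (?, ?)", docs)

ranked = search_files(indexed, "graphql")
fallback = search_files(plain, "REST")
```

Returning the mode alongside the rows makes it easy for a caller to tell the user when results came from the slower, unranked fallback.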
Performance Considerations
Index Optimization
# Periodically optimize indexes for better performance
markitect search rebuild --optimize
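At the SQLite level, FTS5 exposes rebuild and optimize as special INSERT commands on the index table. The sketch below shows how a rebuild with an optional optimize step might be issued from Python; rebuild_index and the allow-list of table names are assumptions for illustration, not MarkiTect's implementation.

```python
import sqlite3

# Table names cannot be bound as SQL parameters, so restrict them to a known set.
ALLOWED_TABLES = ("fts_files", "fts_schemas")


def rebuild_index(conn, table, optimize=False):
    """Rebuild an FTS5 index and optionally merge its b-trees afterwards."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown index table: {table}")
    # 'rebuild' discards the index and recreates it from the stored content.
    conn.execute(f"INSERT INTO {table}({table}) VALUES ('rebuild')")
    if optimize:
        # 'optimize' merges the index segments into a single b-tree.
        conn.execute(f"INSERT INTO {table}({table}) VALUES ('optimize')")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE fts_files USING fts5(filename, content)")
conn.execute(
    "INSERT INTO fts_files (filename, content) VALUES (?, ?)",
    ("a.md", "search index notes"),
)
rebuild_index(conn, "fts_files", optimize=True)
hits = conn.execute(
    "SELECT count(*) FROM fts_files WHERE fts_files MATCH 'index'"
).fetchone()[0]
```

Optimizing can take a while on large indexes, which is one reason to run it periodically rather than after every change.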
Query Performance
- Use specific content types (--type files) when possible
- Limit results with --limit for large result sets
- Use phrase queries for exact matches
- Boolean operators are more efficient than complex natural language
Storage Impact
- FTS5 indexes require additional disk space (typically 30-50% of content size)
- Indexes are automatically maintained, no manual intervention needed
- Use markitect search status to monitor index sizes
Troubleshooting
FTS5 Not Available
If SQLite doesn't have FTS5 support:
markitect search status
# Shows: FTS5 Full Text Search: Disabled
The system automatically falls back to simple LIKE-based search.
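One common way to detect FTS5 support from Python is to try creating a throwaway FTS5 table in a scratch database. The sketch below assumes nothing about how MarkiTect performs its own check; fts5_available is a hypothetical helper.

```python
import sqlite3


def fts5_available():
    """Return True if the linked SQLite library was built with FTS5."""
    probe = sqlite3.connect(":memory:")
    try:
        probe.execute("CREATE VIRTUAL TABLE probe USING fts5(x)")
        return True
    except sqlite3.OperationalError:
        # Typically "no such module: fts5" on builds without the extension.
        return False
    finally:
        probe.close()


FTS5_ENABLED = fts5_available()
print("FTS5 Full Text Search:", "Enabled" if FTS5_ENABLED else "Disabled")
```

FTS5 support is a property of the SQLite library Python is linked against, so the result does not depend on which database file is opened.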
Database Lock Errors
If you see database lock errors:
# Wait for other operations to complete, then retry
markitect search rebuild
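If you script against the database directly, waiting and retrying can be automated. The following is a sketch, not part of MarkiTect: run_with_retry is a hypothetical helper that retries an operation when SQLite reports a lock. Python's sqlite3.connect also accepts a timeout argument that makes SQLite itself wait before raising.

```python
import sqlite3
import time


def run_with_retry(operation, attempts=5, delay=0.5):
    """Call operation(), retrying with a growing pause while the database is locked."""
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            is_last = attempt == attempts - 1
            if "locked" not in str(exc) or is_last:
                raise  # not a lock error, or out of retries
            time.sleep(delay * (attempt + 1))


# Simulated operation that is locked twice before it succeeds.
calls = []


def flaky_rebuild():
    calls.append(1)
    if len(calls) < 3:
        raise sqlite3.OperationalError("database is locked")
    return "ok"


result = run_with_retry(flaky_rebuild, attempts=5, delay=0)
```

Errors other than lock contention are re-raised immediately so that genuine problems are not hidden behind retries.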
Index Corruption
To fix corrupted indexes:
# Rebuild from scratch
markitect search rebuild --optimize
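FTS5 also provides an integrity-check command that can confirm whether an index is damaged before rebuilding it. The sketch below shows one way to combine the two from Python; check_and_repair is a hypothetical helper and is not necessarily what the rebuild command does internally.

```python
import sqlite3

ALLOWED_TABLES = ("fts_files", "fts_schemas")


def check_and_repair(conn, table="fts_files"):
    """Run the FTS5 integrity check; rebuild on failure. Returns True if rebuilt."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown index table: {table}")
    try:
        # Raises a DatabaseError if the index is internally inconsistent.
        conn.execute(f"INSERT INTO {table}({table}) VALUES ('integrity-check')")
        return False
    except sqlite3.DatabaseError:
        conn.execute(f"INSERT INTO {table}({table}) VALUES ('rebuild')")
        return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE fts_files USING fts5(filename, content)")
conn.execute(
    "INSERT INTO fts_files (filename, content) VALUES (?, ?)",
    ("a.md", "healthy index"),
)
rebuilt = check_and_repair(conn)
```

On a healthy index the check passes and no rebuild is performed, so rebuilt is False here.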
No Results Found
Check if content is indexed:
markitect search status
# Check document counts for fts_files and fts_schemas
If no documents are indexed:
markitect search rebuild
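The same document counts can be read straight from the database. This sketch assumes the index tables are named fts_files and fts_schemas, as listed under Database Integration; index_counts is a hypothetical helper, and a value of None is used here to signal that a table does not exist yet.

```python
import sqlite3


def index_counts(conn, tables=("fts_files", "fts_schemas")):
    """Return the number of indexed documents per table, or None if it is missing."""
    counts = {}
    for table in tables:
        try:
            counts[table] = conn.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
        except sqlite3.OperationalError:
            # "no such table": the indexes were never initialized.
            counts[table] = None
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE fts_files USING fts5(filename, content)")
conn.execute(
    "INSERT INTO fts_files (filename, content) VALUES (?, ?)",
    ("a.md", "indexed document"),
)
counts = index_counts(conn)
```

A count of zero suggests a rebuild is needed, while None suggests that initialization has not been run.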
Integration with GraphQL
The search functionality integrates with MarkiTect's GraphQL interface through the existing search resolver, providing both FTS5-powered and fallback search capabilities through the GraphQL API.
Examples
Content Discovery
Find all API-related documentation:
markitect search query "api documentation" --limit 10
Schema Exploration
Find user-related schemas:
markitect search query "user" --type schemas --format json
Comprehensive Search
Search with pagination:
# First page
markitect search query "graphql" --limit 5 --offset 0
# Second page
markitect search query "graphql" --limit 5 --offset 5
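The --limit and --offset options map naturally onto SQL LIMIT and OFFSET. As a sketch of how paging might work underneath, the generator below walks through all result pages of an FTS5 query; iter_pages, the table layout, and the sample rows are assumptions for illustration.

```python
import sqlite3


def iter_pages(conn, query, page_size=5):
    """Yield lists of matching filenames, one page at a time."""
    offset = 0
    while True:
        rows = conn.execute(
            "SELECT filename FROM fts_files WHERE fts_files MATCH ? "
            "ORDER BY rank, rowid LIMIT ? OFFSET ?",
            (query, page_size, offset),
        ).fetchall()
        if not rows:
            return
        yield [row[0] for row in rows]
        offset += page_size


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE fts_files USING fts5(filename, content)")
conn.executemany(
    "INSERT INTO fts_files (filename, content) VALUES (?, ?)",
    [(f"doc{i}.md", f"graphql example number {i}") for i in range(7)],
)
pages = list(iter_pages(conn, "graphql", page_size=3))
```

Adding rowid as a tie-breaker keeps the ordering stable when several documents have the same rank, so no result appears on two pages.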
Advanced Queries
Complex boolean search:
markitect search query "api AND (rest OR graphql) NOT deprecated"
Exact phrase with context:
markitect search query '"mutation resolver"' --type files
The full text search system provides powerful, lightweight search capabilities that scale with your MarkiTect content repository.