# Full Text Search - Issue #83

MarkiTect provides powerful full text search capabilities using SQLite's FTS5 extension, implemented as a lightweight plugin system.

## Features

- **SQLite FTS5**: Leverages SQLite's built-in FTS5 virtual tables for high-performance search
- **No Dependencies**: Uses only SQLite, no additional search libraries required
- **Real-time Indexing**: Automatic index updates when content changes
- **Advanced Queries**: Support for phrase search, boolean operators, and proximity search
- **CLI Integration**: Complete command-line interface for search operations
- **Fallback Support**: Graceful degradation to simple LIKE queries if FTS5 unavailable

## Quick Start

### Initialize Search

First, initialize the search indexes:

```bash
markitect search init
```

This creates FTS5 virtual tables and sets up automatic indexing triggers.

### Rebuild Indexes

To rebuild indexes from scratch:

```bash
markitect search rebuild --optimize
```

### Check Status

View search system status:

```bash
markitect search status
```

### Perform Searches

Search across all content:

```bash
markitect search query "API documentation"
```

Search only files:

```bash
markitect search query "graphql" --type files --limit 5
```

Search only schemas:

```bash
markitect search query "user" --type schemas
```

## Query Syntax

### Simple Queries

```bash
# Single word - automatically adds wildcard
markitect search query "api"
# Finds: api, apis, apiKey, etc.

# Multiple words - implicit AND
markitect search query "api documentation"
# Finds documents with both terms
```

### Phrase Search

```bash
# Exact phrase matching
markitect search query '"GraphQL mutation"'
```

### Boolean Operators

```bash
# AND operator
markitect search query "api AND documentation"

# OR operator
markitect search query "rest OR graphql"

# NOT operator
markitect search query "api NOT deprecated"
```

### Advanced Features

```bash
# Proximity search (terms within 10 words)
markitect search query "NEAR(api documentation, 10)"

# Column-specific search
markitect search query "filename:readme"
```

## CLI Commands

### `markitect search init`

Initialize search indexes and FTS5 tables.

**Options:**
- `--rebuild` - Rebuild existing indexes during initialization

**Examples:**
```bash
markitect search init
markitect search init --rebuild
```

### `markitect search query`

Perform full text search queries.

**Arguments:**
- `QUERY` - Search query string

**Options:**
- `--type [all|files|schemas]` - Content type to search (default: all)
- `--limit INTEGER` - Maximum number of results (default: 20)
- `--offset INTEGER` - Result offset for pagination (default: 0)
- `--format [table|json|yaml]` - Output format (default: table)
- `--no-highlight` - Disable result highlighting

**Examples:**
```bash
markitect search query "documentation"
markitect search query "api" --type files --limit 10
markitect search query "schema" --format json
markitect search query "user" --offset 20 --limit 10  # Pagination
```

### `markitect search status`

Show search index status and statistics.

**Options:**
- `--format [table|json|yaml]` - Output format (default: table)

**Examples:**
```bash
markitect search status
markitect search status --format json
```

### `markitect search rebuild`

Rebuild search indexes from scratch.

**Options:**
- `--optimize` - Optimize indexes after rebuild

**Examples:**
```bash
markitect search rebuild
markitect search rebuild --optimize
```

## Architecture

### Plugin System

The search functionality is implemented as a plugin within MarkiTect's plugin architecture:

- **FTSSearchPlugin**: Main search plugin class
- **SearchIndexer**: Handles FTS5 table creation and maintenance
- **QueryParser**: Parses and optimizes search queries

### Database Integration

- **FTS5 Virtual Tables**: `fts_files` and `fts_schemas` for content indexing
- **Automatic Triggers**: Database triggers keep indexes synchronized
- **Fallback Queries**: LIKE-based search when FTS5 unavailable

### Search Process

1. **Indexing**: Content automatically indexed via database triggers
2. **Query Parsing**: User queries converted to FTS5-compatible syntax
3. **Search Execution**: FTS5 performs ranked full text search
4. **Result Processing**: Results formatted with highlighting and metadata
5. **Fallback**: Simple LIKE queries if FTS5 fails

## Performance Considerations

### Index Optimization

```bash
# Periodically optimize indexes for better performance
markitect search rebuild --optimize
```

### Query Performance

- Use specific content types (`--type files`) when possible
- Limit results with `--limit` for large result sets
- Use phrase queries for exact matches
- Boolean operators are more efficient than complex natural language

### Storage Impact

- FTS5 indexes require additional disk space (typically 30-50% of content size)
- Indexes are automatically maintained, no manual intervention needed
- Use `markitect search status` to monitor index sizes

## Troubleshooting

### FTS5 Not Available

If SQLite doesn't have FTS5 support:

```bash
markitect search status
# Shows: FTS5 Full Text Search: Disabled
```

The system automatically falls back to simple LIKE-based search.
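
To make the fallback concrete, here is a minimal Python sketch using only the standard `sqlite3` module. It is an illustration, not MarkiTect's actual code: the `files` table, its `filename` and `content` columns, and the helper names `fts5_available` and `like_search` are assumptions made for this example.

```python
import sqlite3


def fts5_available() -> bool:
    """Return True if the linked SQLite library can create FTS5 tables."""
    probe = sqlite3.connect(":memory:")
    try:
        probe.execute("CREATE VIRTUAL TABLE probe USING fts5(content)")
        return True
    except sqlite3.OperationalError:
        # Raised when the FTS5 module is not compiled into SQLite.
        return False
    finally:
        probe.close()


def like_search(conn, term, limit=20):
    """Fallback search: substring match via LIKE (case-insensitive for ASCII)."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT filename FROM files "
        "WHERE content LIKE ? ESCAPE '\\' "
        "ORDER BY filename LIMIT ?",
        (f"%{escaped}%", limit),
    )
    return [row[0] for row in rows]


# Demo data in a hypothetical `files` table (not MarkiTect's real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (filename TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [
        ("notes.md", "100% coverage goal"),
        ("readme.md", "API documentation for MarkiTect"),
        ("schema.md", "user schema"),
    ],
)

print("FTS5 available:", fts5_available())
print(like_search(conn, "api"))  # ['readme.md']
```

Unlike an FTS5 query, a LIKE scan has no relevance ranking and no phrase, boolean, or proximity operators, so results in fallback mode are plain substring matches.
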
### Database Lock Errors

If you see database lock errors:

```bash
# Wait for other operations to complete, then retry
markitect search rebuild
```

### Index Corruption

To fix corrupted indexes:

```bash
# Rebuild from scratch
markitect search rebuild --optimize
```

### No Results Found

Check if content is indexed:

```bash
markitect search status
# Check document counts for fts_files and fts_schemas
```

If no documents are indexed:

```bash
markitect search rebuild
```

## Integration with GraphQL

The search functionality integrates with MarkiTect's GraphQL interface through the existing search resolver, providing both FTS5-powered and fallback search capabilities through the GraphQL API.

## Examples

### Content Discovery

Find all API-related documentation:

```bash
markitect search query "api documentation" --limit 10
```

### Schema Exploration

Find user-related schemas:

```bash
markitect search query "user" --type schemas --format json
```

### Comprehensive Search

Search with pagination:

```bash
# First page
markitect search query "graphql" --limit 5 --offset 0

# Second page
markitect search query "graphql" --limit 5 --offset 5
```

### Advanced Queries

Complex boolean search:

```bash
markitect search query "api AND (rest OR graphql) NOT deprecated"
```

Exact phrase with context:

```bash
markitect search query '"mutation resolver"' --type files
```

The full text search system provides powerful, lightweight search capabilities that scale with your MarkiTect content repository.
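
As a rough illustration of the rules described under Query Syntax (a single word gets a wildcard, multiple words are combined with AND, and explicit FTS5 syntax is left alone), a query translator might look like the sketch below. The function name `to_fts5_query` and its exact rules are hypothetical; the real `QueryParser` may behave differently.

```python
import re

# Characters and uppercase keywords that suggest the user already wrote
# explicit FTS5 syntax (phrase, grouping, column filter, prefix, operators).
_EXPLICIT_SYNTAX = re.compile(r'["():*]|\b(?:AND|OR|NOT|NEAR)\b')


def to_fts5_query(raw: str) -> str:
    """Translate a simple user query into an FTS5 MATCH expression.

    Hypothetical rules, inferred from the documented behavior only.
    """
    query = raw.strip()
    if not query:
        raise ValueError("empty search query")
    if _EXPLICIT_SYNTAX.search(query):
        # Phrase, boolean, NEAR, or column-specific query: pass through as-is.
        return query
    terms = query.split()
    if len(terms) == 1:
        # Single word: add a trailing wildcard for prefix matching.
        return f"{terms[0]}*"
    # Multiple words: make the implicit AND explicit.
    return " AND ".join(terms)


print(to_fts5_query("api"))                # api*
print(to_fts5_query("api documentation"))  # api AND documentation
print(to_fts5_query("rest OR graphql"))    # rest OR graphql
```

A production parser would also need to quote terms containing characters such as `-`, which FTS5 does not accept in unquoted barewords.
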