.TH MARKITECT 1 "September 2025" "MarkiTect 1.0" "MarkiTect Manual"
.SH NAME
markitect \- high-performance markdown processing engine with AST caching
.SH SYNOPSIS
.B markitect
[\fIOPTION\fR]... [\fICOMMAND\fR] [\fIFILE\fR]...
.SH DESCRIPTION
MarkiTect is a high-performance markdown processing engine that implements a
"parse-once, manipulate-many" architecture with intelligent AST caching and
database-first metadata management.
.PP
Markdown files are parsed once and stored in multiple fast-access
representations: JSON-serialized AST cache files, structured database
metadata, and the preserved original content.
Later operations read these representations instead of re-parsing the source,
which keeps complex document workflows fast.
.SH COMMANDS
.SS Document Processing
.TP
.B ingest \fIFILE\fR
Ingest a markdown file into the MarkiTect system.
Creates the AST cache and stores metadata in the database.
The initial parse adds overhead, but subsequent cache loads take less than
50% of the parse time.
.TP
.B ingest-batch \fIDIRECTORY\fR
Batch process all markdown files in a directory.
Supports recursive processing with the \fB\-\-recursive\fR option.
.TP
.B status \fIFILE\fR
Show processing status and cache information for a file.
Displays parse time, cache time, and cache validity.
.SS Cache Management
.TP
.B cache-info \fIFILE\fR
Display detailed cache information including performance metrics.
Shows cache hit/miss ratio and loading time statistics.
.TP
.B cache-invalidate \fIFILE\fR
Force invalidation of the AST cache for a file.
Useful when a manual cache refresh is needed.
.TP
.B cache-clean
Remove all stale cache files, as determined by source file modification times.
Performs automatic cache maintenance.
.SS Database Operations
.TP
.B query \fISQL\fR
Execute an SQL query against the document metadata database.
Enables relational operations on front matter data.
.TP
.B list
List all ingested documents with a metadata summary.
Shows filename, title, modification time, and cache status.
.TP
.B show \fIFILE\fR
Display complete metadata for a specific document.
Includes front matter, processing times, and cache information.
.TP
.B export \fIFORMAT\fR
Export document metadata in the specified format (json, csv, yaml).
Supports filtered exports with the \fB\-\-filter\fR option.
.SS AST Operations
.TP
.B ast-dump \fIFILE\fR
Output the AST representation of a markdown file.
Useful for debugging and analysis.
Uses the cached AST if available.
.TP
.B ast-query \fIFILE\fR \fIQUERY\fR
Query the AST structure using JSONPath expressions.
Examples: $..[?@.type=='heading_open'], $..[?@.level==1]
.TP
.B ast-transform \fIFILE\fR \fISCRIPT\fR
Apply a transformation script to the AST structure.
Supports custom Python scripts for content modification.
.SS Performance Analysis
.TP
.B benchmark \fIFILE\fR
Run a performance benchmark comparing parse and cache load times.
Validates the < 50% cache loading performance requirement.
.TP
.B profile \fIDIRECTORY\fR
Generate a performance profile for a collection of documents.
Identifies performance bottlenecks and optimization opportunities.
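.PP
The comparison that \fBbenchmark\fR is described as making can be sketched in
a few lines of Python.
This is only an illustration under stated assumptions, not MarkiTect's
implementation: the helper names (best_time, meets_cache_requirement,
toy_parse) are hypothetical, and toy_parse is a stand-in for a real markdown
parser.
Only the rule itself comes from this manual: a cache load must take less than
50% of the parse time.
.PP
.nf
```python
import json
import time


def best_time(func, repeats=5):
    """Return the fastest wall-clock time over several calls to func."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def meets_cache_requirement(parse_seconds, cache_seconds, limit=0.5):
    """True when the cache load took less than `limit` of the parse time."""
    return cache_seconds < limit * parse_seconds


def toy_parse(text):
    """Hypothetical stand-in for a markdown parser: one node per line."""
    return [{"type": "line", "content": line} for line in text.splitlines()]


if __name__ == "__main__":
    # Build a synthetic document and a JSON "cache" of its parsed form.
    source = chr(10).join(["# Title", "", "Some text."] * 1000)
    cached = json.dumps(toy_parse(source))

    parse_seconds = best_time(lambda: toy_parse(source))
    cache_seconds = best_time(lambda: json.loads(cached))

    print("parse:", parse_seconds, "cache load:", cache_seconds)
    print("requirement met:", meets_cache_requirement(parse_seconds, cache_seconds))
```
.fi
.PP
Taking the minimum over several runs reduces timing noise.
With the toy parser the ratio says nothing about real documents; the point is
only the shape of the check.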
.SH OPTIONS
.SS Global Options
.TP
.B \-\-cache-dir \fIDIRECTORY\fR
Specify a custom cache directory (default: .ast_cache)
.TP
.B \-\-database \fIFILE\fR
Specify the database file path (default: markitect.db)
.TP
.B \-\-verbose, \-v
Enable verbose output with performance timing details
.TP
.B \-\-quiet, \-q
Suppress non-essential output
.TP
.B \-\-config \fIFILE\fR
Use a custom configuration file
.SS Processing Options
.TP
.B \-\-recursive, \-r
Process directories recursively
.TP
.B \-\-force, \-f
Force reprocessing even if the cache is valid
.TP
.B \-\-validate
Validate performance requirements during processing
.TP
.B \-\-no-cache
Disable AST caching (parse every time)
.SS Output Options
.TP
.B \-\-format \fIFORMAT\fR
Output format: json, yaml, csv, table (default: table)
.TP
.B \-\-output \fIFILE\fR
Write output to a file instead of stdout
.TP
.B \-\-filter \fIEXPRESSION\fR
Filter results using a JSONPath expression
.SH PERFORMANCE GUARANTEES
MarkiTect provides documented performance contracts:
.TP
.B Cache Loading Time
AST cache loading is guaranteed to take less than 50% of the original markdown
parsing time.
This is validated by automated tests and can be verified with
\fBmarkitect benchmark\fR.
.TP
.B Database Queries
Metadata queries typically complete in under a millisecond for collections of
up to 10,000 documents.
.TP
.B Memory Usage
Memory usage for cache operations is constant regardless of document size.
It scales linearly with the number of documents processed simultaneously.
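.PP
The kind of metadata query the Database Queries contract refers to can be
illustrated with Python's standard sqlite3 module.
This is a sketch under assumptions, not the real schema: the table name
markdown_files and the JSON column front_matter are taken from the query shown
in EXAMPLES, while the path column, the sample rows, and the helper
count_by_author are hypothetical.
It also assumes an SQLite build that provides json_extract.
.PP
.nf
```python
import json
import sqlite3

# Assumed schema: only `markdown_files` and `front_matter` appear in this
# manual; the `path` column and the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE markdown_files (path TEXT PRIMARY KEY, front_matter TEXT)"
)

documents = {
    "a.md": {"author": "John Doe", "title": "First"},
    "b.md": {"author": "Jane Roe", "title": "Second"},
    "c.md": {"author": "John Doe", "title": "Third"},
}
conn.executemany(
    "INSERT INTO markdown_files VALUES (?, ?)",
    [(path, json.dumps(meta)) for path, meta in documents.items()],
)


def count_by_author(connection):
    """Count documents per front matter author, sorted by author name."""
    rows = connection.execute(
        "SELECT json_extract(front_matter, '$.author') AS author, COUNT(*) "
        "FROM markdown_files GROUP BY author ORDER BY author"
    )
    return rows.fetchall()


print(count_by_author(conn))
```
.fi
.PP
Because front matter is stored as JSON text in this sketch, json_extract turns
a front matter key into an ordinary SQL value that can be grouped, filtered,
or joined on.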
.SH CONFIGURATION
MarkiTect can be configured through:
.TP
.B Configuration File
~/.markitect/config.yaml, or a file specified with the \fB\-\-config\fR option
.TP
.B Environment Variables
.RS
MARKITECT_CACHE_DIR \- Default cache directory
.br
MARKITECT_DATABASE \- Default database file
.br
MARKITECT_VALIDATE_PERFORMANCE \- Enable automatic performance validation
.RE
.SH ARCHITECTURE
.TP
.B Parse-Once, Manipulate-Many
Source files are parsed once to create multiple fast-access representations:
.RS
\- AST Cache: JSON-serialized Abstract Syntax Tree
.br
\- Database Metadata: Structured front matter and document metadata
.br
\- Original Content: Preserved for integrity validation
.RE
.TP
.B Intelligent Cache Invalidation
Cache files are automatically invalidated based on source file modification
times.
Manual cache management is normally not required; \fBcache-invalidate\fR
remains available to force a refresh.
.TP
.B Database-First Metadata
Front matter becomes queryable relational data with full SQL capabilities.
Supports joins, aggregations, and complex filtering operations.
.SH EXAMPLES
.TP
.B Basic Document Processing
.nf
# Ingest a single markdown file
markitect ingest document.md

# Process all markdown files in a directory
markitect ingest-batch docs/ --recursive

# Show processing status
markitect status document.md
.fi
.TP
.B Cache Operations
.nf
# Display cache information
markitect cache-info document.md

# Clean stale cache files
markitect cache-clean

# Force cache regeneration
markitect cache-invalidate document.md --force
.fi
.TP
.B Database Queries
.nf
# List all documents
markitect list

# Query by metadata
markitect query "SELECT * FROM markdown_files WHERE json_extract(front_matter, '$.author') = 'John Doe'"

# Export metadata
markitect export json --output metadata.json
.fi
.TP
.B AST Analysis
.nf
# Dump AST structure
markitect ast-dump document.md --format json

# Query for all headings
markitect ast-query document.md "$..[?@.type=='heading_open']"

# Find level 1 headings
markitect ast-query document.md "$..[?@.level==1]"
.fi
.TP
.B Performance Analysis
.nf
# Benchmark a single file
markitect benchmark document.md

# Profile a document collection
markitect profile docs/ --recursive

# Validate performance requirements
markitect ingest document.md --validate
.fi
.SH EXIT STATUS
.TP
.B 0
Success
.TP
.B 1
General error (file not found, permission denied, etc.)
.TP
.B 2
Performance requirement violation (cache loading >= 50% of parse time)
.TP
.B 3
Database error (corruption, schema mismatch, etc.)
.TP
.B 4
Cache error (corruption, permission denied, etc.)
.SH FILES
.TP
.B ~/.markitect/config.yaml
User configuration file
.TP
.B .ast_cache/
Default AST cache directory
.TP
.B markitect.db
Default SQLite database file
.TP
.B .markitect_workspace/
Workspace directory for development workflows
.SH DIAGNOSTICS
Common diagnostic commands:
.TP
.B Performance Issues
.nf
markitect benchmark problematic_file.md
markitect profile slow_directory/ --verbose
.fi
.TP
.B Cache Problems
.nf
markitect cache-info file.md
markitect cache-clean --verbose
.fi
.TP
.B Database Issues
.nf
markitect query "PRAGMA integrity_check"
markitect list --validate
.fi
.SH BUGS
Report bugs to: https://github.com/project/markitect/issues
.SH SEE ALSO
.BR markdown (1),
.BR sqlite3 (1),
.BR jq (1)
.SH AUTHORS
MarkiTect development team
.SH COPYRIGHT
Copyright (C) 2025 MarkiTect Project.
This is free software; see the source for copying conditions.