- Moved LLM_INTEGRATION_GAMEPLAN.md to history/ (strategic planning complete) - Moved IMPLEMENTATION_ISSUES.md to history/ (issues created in system) - Both documents served their purpose in planning and issue creation - Issues #100-109 now registered in MarkiTect issue management system - Ready for future development when LLM integration work begins 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
13 KiB
LLM Integration Gameplan - Issues #98 & #99
Date: 2025-10-03 Status: REQUIREMENTS ANALYSIS Priority: HIGH Estimated Effort: 4-6 weeks development
🎯 Executive Summary
Two complementary features that will transform MarkiTect from a content management system into an AI-powered knowledge assistant:
- Issue #98: OpenRoute Integration - Enable LLM queries against MarkiTect content
- Issue #99: Auto Fill Templates - LLM-powered interactive template completion
📋 Current State Analysis
✅ Existing Infrastructure (Ready to Leverage)
- Template System: Full template engine with parsing and rendering (
markitect/template/) - Configuration Manager: Extensible config system with CLI integration
- Query Paradigms: Natural Language paradigm exists (documented only)
- CLI Framework: Click-based with established patterns
- Database: SQLite with full metadata and content indexing
- FTS Search: Full text search capabilities for content discovery
🏗️ Infrastructure Gaps (Need Development)
- LLM Client: No OpenRouter integration exists
- Profile System: No user profile management
- Interactive UI: No terminal questionnaire system
- Context Building: No intelligent content selection for LLM queries
🚀 Issue #98: OpenRoute Integration
Requirements Analysis
Goal: "Use MarkiTect ingested content as context for interacting with LLMs flexibly and conveniently"
User Story: "As a user, I want to ask natural language questions about my content and get intelligent responses with source citations"
Integration: "Allow users to connect with an existing OpenRouter account"
Technical Implementation Plan
Phase 1: Core LLM Infrastructure (Week 1)
-
OpenRouter Client Development
# markitect/llm/openrouter_client.py class OpenRouterClient: - API key management - Model selection (GPT-4, Claude, etc.) - Request/response handling - Rate limiting and error handling - Cost tracking -
Configuration Integration
markitect config-set openrouter.api_key sk-or-... markitect config-set openrouter.default_model openai/gpt-4-turbo markitect config-show --show-sensitive # Show API keys -
Basic CLI Commands
markitect llm test # Test OpenRouter connection markitect llm models # List available models markitect llm ask "Simple question" # Basic LLM interaction
Phase 2: Content Context Integration (Week 2)
-
Context Builder System
# markitect/llm/context_builder.py class ContextBuilder: - Extract relevant content from database - Use FTS search for content discovery - Build context within token limits - Include metadata and relationships -
Enhanced Natural Language Paradigm
# Update markitect/query_paradigms/paradigms/natural_language_paradigm.py class NaturalLanguageQueryParadigm: - Integrate OpenRouter for real LLM processing - Build context from MarkiTect content - Return structured responses with citations -
Advanced CLI Integration
markitect paradigms exec "Natural Language" "What are the main API concepts?" markitect llm chat # Interactive mode markitect llm ask "Summarize docs tagged tutorial" # Filtered context
Phase 3: Advanced Features (Week 3)
-
Smart Context Selection
- Relevance scoring for content inclusion
- Context size optimization
- Source citation tracking
-
Response Enhancement
- Markdown formatting
- Source links back to MarkiTect files
- Follow-up question suggestions
Success Criteria
- ✅ OpenRouter integration working with API key configuration
- ✅ Natural language queries return relevant, contextualized responses
- ✅ Responses include source citations linking to MarkiTect files
- ✅ Context building intelligently selects relevant content
- ✅ CLI commands integrated with existing paradigm system
📝 Issue #99: Auto Fill Templates
Requirements Analysis
Goal: "Use Markdown Templates to capture data with terminal questionnaire and LLM auto-fill"
User Story: "As a user, I want to fill templates interactively, with the system auto-completing fields based on my profile"
LLM Integration: "Provided the user has a profile, an LLM should autofill based on the profile provided"
Technical Implementation Plan
Phase 1: Enhanced Template System (Week 1)
-
Template Field Analysis
# markitect/template/field_analyzer.py class TemplateFieldAnalyzer: - Parse template annotations: {{name:string:Your full name}} - Extract field types, descriptions, validation rules - Identify required vs optional fields - Support nested field structures -
Interactive Questionnaire Engine
# markitect/template/questionnaire.py class TemplateQuestionnaire: - Terminal-based interactive data collection - Support input types: text, choice, date, number, boolean - Field validation and re-prompting - Progress tracking and partial save -
Basic CLI Commands
markitect template-fill template.md # Interactive questionnaire markitect template-analyze template.md # Show template fields markitect template-validate template.md # Validate template syntax
Phase 2: User Profile System (Week 2)
-
Profile Management
# markitect/profile/manager.py class ProfileManager: - Create, read, update, delete profiles - Support multiple profiles (personal, work, etc.) - Profile inheritance and templates - Database storage integration -
Profile Schema System
# markitect/profile/schema.py - Standard profile fields (personal, professional, technical) - Custom field extensions - JSON Schema validation - Field type definitions and constraints -
Profile CLI Commands
markitect profile create personal markitect profile set personal.name "John Doe" markitect profile set personal.email "john@example.com" markitect profile show personal markitect profile list markitect profile export personal profile.json
Phase 3: LLM-Powered Auto-Fill (Week 3)
-
Smart Field Completion
# markitect/template/auto_filler.py class LLMAutoFiller: - Use OpenRouter LLM for field suggestions - Context-aware completions based on template purpose - Profile-informed field values - Learning from user corrections -
Advanced Template Fill Modes
markitect template-fill template.md --auto # Auto-fill from profile markitect template-fill template.md --guided # Mix auto + questions markitect template-fill template.md --profile=work # Use specific profile markitect template-fill template.md --learn # Learn from corrections
Phase 4: Advanced Features (Week 4)
-
Field Intelligence
- Template field learning and preferences
- Content generation for complex fields
- Multi-step form workflows
- Field dependencies and conditional logic
-
Integration Features
- Template field suggestions based on existing content
- Auto-population from MarkiTect database
- Template version control and updates
Success Criteria
- ✅ Interactive terminal questionnaire for template completion
- ✅ User profile system with multiple profile support
- ✅ LLM-powered auto-fill suggestions based on user profile
- ✅ Enhanced template parser supporting field metadata
- ✅ Seamless integration with existing template rendering system
🔗 Shared Infrastructure Requirements
Database Schema Extensions
-- User profiles table
CREATE TABLE user_profiles (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL UNIQUE,
data JSON NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- LLM interaction logs (optional)
CREATE TABLE llm_interactions (
id INTEGER PRIMARY KEY,
query TEXT NOT NULL,
response TEXT NOT NULL,
model TEXT NOT NULL,
tokens_used INTEGER,
cost REAL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Template usage history
CREATE TABLE template_usage (
id INTEGER PRIMARY KEY,
template_path TEXT NOT NULL,
field_data JSON NOT NULL,
profile_used TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Configuration Extensions
# .markitect.yml additions
openrouter:
api_key: "sk-or-..."
default_model: "openai/gpt-4-turbo"
max_tokens: 4096
temperature: 0.7
profiles:
default_profile: "personal"
auto_save: true
templates:
auto_fill_mode: "guided" # auto, interactive, guided
learn_from_corrections: true
📊 Implementation Priority Matrix
| Component | Issue | Priority | Effort | Dependencies |
|---|---|---|---|---|
| OpenRouter Client | #98 | HIGH | 2 days | Config system |
| Context Builder | #98 | HIGH | 3 days | FTS, Database |
| Profile Manager | #99 | HIGH | 2 days | Database |
| Template Field Parser | #99 | HIGH | 3 days | Template system |
| Interactive Questionnaire | #99 | MEDIUM | 4 days | Profile system |
| LLM Auto-Fill | #99 | MEDIUM | 4 days | OpenRouter, Profiles |
| Natural Language Enhancement | #98 | MEDIUM | 2 days | OpenRouter, Context |
| Advanced Context | #98 | LOW | 3 days | Basic LLM working |
🧪 Testing Strategy
Unit Tests Required
- OpenRouter client error handling and retries
- Template field parsing and validation
- Profile CRUD operations
- Context building with different content types
- LLM response formatting and citation extraction
Integration Tests Required
- End-to-end template filling workflow
- Natural language queries with MarkiTect context
- Profile-based auto-fill accuracy
- CLI command integration
Manual Testing Scenarios
- OpenRouter Setup: User configures API key and tests connection
- Template Creation: User creates template with various field types
- Profile Management: User creates and manages multiple profiles
- Interactive Fill: User completes template via questionnaire
- Auto-Fill: System suggests field values based on profile
- LLM Queries: User asks questions about their content
- Context Accuracy: Verify LLM responses cite correct sources
🎯 Success Metrics & KPIs
Quantitative Metrics
- Template Completion Time: Reduce by 60% with auto-fill
- Query Response Accuracy: >90% relevant context inclusion
- User Satisfaction: >8/10 rating for LLM responses
- Profile Usage: >75% of template fills use profile data
Qualitative Metrics
- User Experience: Seamless workflow integration
- Content Discovery: Users find value in LLM-powered content exploration
- Productivity: Templates become preferred method for document creation
- Accuracy: LLM suggestions match user intent and context
🚧 Risk Assessment & Mitigation
Technical Risks
- OpenRouter API Changes: Mitigate with versioned API client and error handling
- Token Limits: Implement intelligent context truncation and chunking
- LLM Response Quality: Add response validation and fallback mechanisms
- Performance: Cache common queries and optimize context building
User Experience Risks
- Complex Configuration: Provide setup wizard and clear documentation
- Learning Curve: Include examples and guided tutorials
- Profile Privacy: Implement secure storage and optional features
- Cost Concerns: Add usage tracking and budget controls
📝 Requirements Engineering Notes
FOR REQUIREMENTS ENGINEER:
-
User Research Needed:
- Survey existing MarkiTect users about LLM integration preferences
- Gather template usage patterns and pain points
- Validate profile data schema with target users
-
Technical Validation Required:
- Verify OpenRouter API capabilities and limitations
- Test LLM response quality with MarkiTect content types
- Validate template field parsing edge cases
-
Feature Prioritization:
- Consider implementing #98 first for immediate value
- #99 can follow as enhanced template experience
- Both share OpenRouter infrastructure investment
-
Alternative Approaches:
- Consider other LLM providers (Anthropic direct, Azure OpenAI)
- Evaluate local LLM options for privacy-conscious users
- Template auto-fill could work without LLM (rule-based initially)
-
Integration Points:
- Leverage existing Query Paradigm system for #98
- Build on solid template foundation for #99
- Utilize configuration manager for seamless setup
RECOMMENDATION: Proceed with implementation in phases, starting with OpenRouter client and basic LLM integration for #98, then expanding to template auto-fill for #99. The shared infrastructure investment will benefit both features significantly.