- Moved LLM_INTEGRATION_GAMEPLAN.md to history/ (strategic planning complete) - Moved IMPLEMENTATION_ISSUES.md to history/ (issues created in system) - Both documents served their purpose in planning and issue creation - Issues #100-109 now registered in MarkiTect issue management system - Ready for future development when LLM integration work begins 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
366 lines
13 KiB
Markdown
366 lines
13 KiB
Markdown
# LLM Integration Gameplan - Issues #98 & #99
|
|
|
|
**Date**: 2025-10-03
|
|
**Status**: REQUIREMENTS ANALYSIS
|
|
**Priority**: HIGH
|
|
**Estimated Effort**: 4-6 weeks development
|
|
|
|
## 🎯 Executive Summary
|
|
|
|
Two complementary features that will transform MarkiTect from a content management system into an **AI-powered knowledge assistant**:
|
|
|
|
- **Issue #98**: OpenRoute Integration - Enable LLM queries against MarkiTect content
|
|
- **Issue #99**: Auto Fill Templates - LLM-powered interactive template completion
|
|
|
|
## 📋 Current State Analysis
|
|
|
|
### ✅ Existing Infrastructure (Ready to Leverage)
|
|
- **Template System**: Full template engine with parsing and rendering (`markitect/template/`)
|
|
- **Configuration Manager**: Extensible config system with CLI integration
|
|
- **Query Paradigms**: Natural Language paradigm exists (documented only)
|
|
- **CLI Framework**: Click-based with established patterns
|
|
- **Database**: SQLite with full metadata and content indexing
|
|
- **FTS Search**: Full text search capabilities for content discovery
|
|
|
|
### 🏗️ Infrastructure Gaps (Need Development)
|
|
- **LLM Client**: No OpenRouter integration exists
|
|
- **Profile System**: No user profile management
|
|
- **Interactive UI**: No terminal questionnaire system
|
|
- **Context Building**: No intelligent content selection for LLM queries
|
|
|
|
## 🚀 Issue #98: OpenRoute Integration
|
|
|
|
### Requirements Analysis
|
|
```yaml
|
|
Goal: "Use MarkiTect ingested content as context for interacting with LLMs flexibly and conveniently"
|
|
User Story: "As a user, I want to ask natural language questions about my content and get intelligent responses with source citations"
|
|
Integration: "Allow users to connect with an existing OpenRouter account"
|
|
```
|
|
|
|
### Technical Implementation Plan
|
|
|
|
#### Phase 1: Core LLM Infrastructure (Week 1)
|
|
1. **OpenRouter Client Development**
|
|
```python
|
|
# markitect/llm/openrouter_client.py
|
|
class OpenRouterClient:
|
|
- API key management
|
|
- Model selection (GPT-4, Claude, etc.)
|
|
- Request/response handling
|
|
- Rate limiting and error handling
|
|
- Cost tracking
|
|
```
|
|
|
|
2. **Configuration Integration**
|
|
```bash
|
|
markitect config-set openrouter.api_key sk-or-...
|
|
markitect config-set openrouter.default_model openai/gpt-4-turbo
|
|
markitect config-show --show-sensitive # Show API keys
|
|
```
|
|
|
|
3. **Basic CLI Commands**
|
|
```bash
|
|
markitect llm test # Test OpenRouter connection
|
|
markitect llm models # List available models
|
|
markitect llm ask "Simple question" # Basic LLM interaction
|
|
```
|
|
|
|
#### Phase 2: Content Context Integration (Week 2)
|
|
4. **Context Builder System**
|
|
```python
|
|
# markitect/llm/context_builder.py
|
|
class ContextBuilder:
|
|
- Extract relevant content from database
|
|
- Use FTS search for content discovery
|
|
- Build context within token limits
|
|
- Include metadata and relationships
|
|
```
|
|
|
|
5. **Enhanced Natural Language Paradigm**
|
|
```python
|
|
# Update markitect/query_paradigms/paradigms/natural_language_paradigm.py
|
|
class NaturalLanguageQueryParadigm:
|
|
- Integrate OpenRouter for real LLM processing
|
|
- Build context from MarkiTect content
|
|
- Return structured responses with citations
|
|
```
|
|
|
|
6. **Advanced CLI Integration**
|
|
```bash
|
|
markitect paradigms exec "Natural Language" "What are the main API concepts?"
|
|
markitect llm chat # Interactive mode
|
|
markitect llm ask "Summarize docs tagged tutorial" # Filtered context
|
|
```
|
|
|
|
#### Phase 3: Advanced Features (Week 3)
|
|
7. **Smart Context Selection**
|
|
- Relevance scoring for content inclusion
|
|
- Context size optimization
|
|
- Source citation tracking
|
|
|
|
8. **Response Enhancement**
|
|
- Markdown formatting
|
|
- Source links back to MarkiTect files
|
|
- Follow-up question suggestions
|
|
|
|
### Success Criteria
|
|
- ✅ OpenRouter integration working with API key configuration
|
|
- ✅ Natural language queries return relevant, contextualized responses
|
|
- ✅ Responses include source citations linking to MarkiTect files
|
|
- ✅ Context building intelligently selects relevant content
|
|
- ✅ CLI commands integrated with existing paradigm system
|
|
|
|
## 📝 Issue #99: Auto Fill Templates
|
|
|
|
### Requirements Analysis
|
|
```yaml
|
|
Goal: "Use Markdown Templates to capture data with terminal questionnaire and LLM auto-fill"
|
|
User Story: "As a user, I want to fill templates interactively, with the system auto-completing fields based on my profile"
|
|
LLM Integration: "Provided the user has a profile, an LLM should autofill based on the profile provided"
|
|
```
|
|
|
|
### Technical Implementation Plan
|
|
|
|
#### Phase 1: Enhanced Template System (Week 1)
|
|
1. **Template Field Analysis**
|
|
```python
|
|
# markitect/template/field_analyzer.py
|
|
class TemplateFieldAnalyzer:
|
|
- Parse template annotations: {{name:string:Your full name}}
|
|
- Extract field types, descriptions, validation rules
|
|
- Identify required vs optional fields
|
|
- Support nested field structures
|
|
```
|
|
|
|
2. **Interactive Questionnaire Engine**
|
|
```python
|
|
# markitect/template/questionnaire.py
|
|
class TemplateQuestionnaire:
|
|
- Terminal-based interactive data collection
|
|
- Support input types: text, choice, date, number, boolean
|
|
- Field validation and re-prompting
|
|
- Progress tracking and partial save
|
|
```
|
|
|
|
3. **Basic CLI Commands**
|
|
```bash
|
|
markitect template-fill template.md # Interactive questionnaire
|
|
markitect template-analyze template.md # Show template fields
|
|
markitect template-validate template.md # Validate template syntax
|
|
```
|
|
|
|
#### Phase 2: User Profile System (Week 2)
|
|
4. **Profile Management**
|
|
```python
|
|
# markitect/profile/manager.py
|
|
class ProfileManager:
|
|
- Create, read, update, delete profiles
|
|
- Support multiple profiles (personal, work, etc.)
|
|
- Profile inheritance and templates
|
|
- Database storage integration
|
|
```
|
|
|
|
5. **Profile Schema System**
|
|
```python
|
|
# markitect/profile/schema.py
|
|
- Standard profile fields (personal, professional, technical)
|
|
- Custom field extensions
|
|
- JSON Schema validation
|
|
- Field type definitions and constraints
|
|
```
|
|
|
|
6. **Profile CLI Commands**
|
|
```bash
|
|
markitect profile create personal
|
|
markitect profile set personal.name "John Doe"
|
|
markitect profile set personal.email "john@example.com"
|
|
markitect profile show personal
|
|
markitect profile list
|
|
markitect profile export personal profile.json
|
|
```
|
|
|
|
#### Phase 3: LLM-Powered Auto-Fill (Week 3)
|
|
7. **Smart Field Completion**
|
|
```python
|
|
# markitect/template/auto_filler.py
|
|
class LLMAutoFiller:
|
|
- Use OpenRouter LLM for field suggestions
|
|
- Context-aware completions based on template purpose
|
|
- Profile-informed field values
|
|
- Learning from user corrections
|
|
```
|
|
|
|
8. **Advanced Template Fill Modes**
|
|
```bash
|
|
markitect template-fill template.md --auto # Auto-fill from profile
|
|
markitect template-fill template.md --guided # Mix auto + questions
|
|
markitect template-fill template.md --profile=work # Use specific profile
|
|
markitect template-fill template.md --learn # Learn from corrections
|
|
```
|
|
|
|
#### Phase 4: Advanced Features (Week 4)
|
|
9. **Field Intelligence**
|
|
- Template field learning and preferences
|
|
- Content generation for complex fields
|
|
- Multi-step form workflows
|
|
- Field dependencies and conditional logic
|
|
|
|
10. **Integration Features**
|
|
- Template field suggestions based on existing content
|
|
- Auto-population from MarkiTect database
|
|
- Template version control and updates
|
|
|
|
### Success Criteria
|
|
- ✅ Interactive terminal questionnaire for template completion
|
|
- ✅ User profile system with multiple profile support
|
|
- ✅ LLM-powered auto-fill suggestions based on user profile
|
|
- ✅ Enhanced template parser supporting field metadata
|
|
- ✅ Seamless integration with existing template rendering system
|
|
|
|
## 🔗 Shared Infrastructure Requirements
|
|
|
|
### Database Schema Extensions
|
|
```sql
|
|
-- User profiles table
|
|
CREATE TABLE user_profiles (
|
|
id INTEGER PRIMARY KEY,
|
|
name TEXT NOT NULL UNIQUE,
|
|
data JSON NOT NULL,
|
|
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
|
|
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
|
);
|
|
|
|
-- LLM interaction logs (optional)
|
|
CREATE TABLE llm_interactions (
|
|
id INTEGER PRIMARY KEY,
|
|
query TEXT NOT NULL,
|
|
response TEXT NOT NULL,
|
|
model TEXT NOT NULL,
|
|
tokens_used INTEGER,
|
|
cost REAL,
|
|
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
|
);
|
|
|
|
-- Template usage history
|
|
CREATE TABLE template_usage (
|
|
id INTEGER PRIMARY KEY,
|
|
template_path TEXT NOT NULL,
|
|
field_data JSON NOT NULL,
|
|
profile_used TEXT,
|
|
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
|
|
);
|
|
```
|
|
|
|
### Configuration Extensions
|
|
```yaml
|
|
# .markitect.yml additions
|
|
openrouter:
|
|
api_key: "sk-or-..."
|
|
default_model: "openai/gpt-4-turbo"
|
|
max_tokens: 4096
|
|
temperature: 0.7
|
|
|
|
profiles:
|
|
default_profile: "personal"
|
|
auto_save: true
|
|
|
|
templates:
|
|
auto_fill_mode: "guided" # auto, interactive, guided
|
|
learn_from_corrections: true
|
|
```
|
|
|
|
## 📊 Implementation Priority Matrix
|
|
|
|
| Component | Issue | Priority | Effort | Dependencies |
|
|
|-----------|-------|----------|--------|--------------|
|
|
| OpenRouter Client | #98 | HIGH | 2 days | Config system |
|
|
| Context Builder | #98 | HIGH | 3 days | FTS, Database |
|
|
| Profile Manager | #99 | HIGH | 2 days | Database |
|
|
| Template Field Parser | #99 | HIGH | 3 days | Template system |
|
|
| Interactive Questionnaire | #99 | MEDIUM | 4 days | Profile system |
|
|
| LLM Auto-Fill | #99 | MEDIUM | 4 days | OpenRouter, Profiles |
|
|
| Natural Language Enhancement | #98 | MEDIUM | 2 days | OpenRouter, Context |
|
|
| Advanced Context | #98 | LOW | 3 days | Basic LLM working |
|
|
|
|
## 🧪 Testing Strategy
|
|
|
|
### Unit Tests Required
|
|
- OpenRouter client error handling and retries
|
|
- Template field parsing and validation
|
|
- Profile CRUD operations
|
|
- Context building with different content types
|
|
- LLM response formatting and citation extraction
|
|
|
|
### Integration Tests Required
|
|
- End-to-end template filling workflow
|
|
- Natural language queries with MarkiTect context
|
|
- Profile-based auto-fill accuracy
|
|
- CLI command integration
|
|
|
|
### Manual Testing Scenarios
|
|
1. **OpenRouter Setup**: User configures API key and tests connection
|
|
2. **Template Creation**: User creates template with various field types
|
|
3. **Profile Management**: User creates and manages multiple profiles
|
|
4. **Interactive Fill**: User completes template via questionnaire
|
|
5. **Auto-Fill**: System suggests field values based on profile
|
|
6. **LLM Queries**: User asks questions about their content
|
|
7. **Context Accuracy**: Verify LLM responses cite correct sources
|
|
|
|
## 🎯 Success Metrics & KPIs
|
|
|
|
### Quantitative Metrics
|
|
- **Template Completion Time**: Reduce by 60% with auto-fill
|
|
- **Query Response Accuracy**: >90% relevant context inclusion
|
|
- **User Satisfaction**: >8/10 rating for LLM responses
|
|
- **Profile Usage**: >75% of template fills use profile data
|
|
|
|
### Qualitative Metrics
|
|
- **User Experience**: Seamless workflow integration
|
|
- **Content Discovery**: Users find value in LLM-powered content exploration
|
|
- **Productivity**: Templates become preferred method for document creation
|
|
- **Accuracy**: LLM suggestions match user intent and context
|
|
|
|
## 🚧 Risk Assessment & Mitigation
|
|
|
|
### Technical Risks
|
|
1. **OpenRouter API Changes**: Mitigate with versioned API client and error handling
|
|
2. **Token Limits**: Implement intelligent context truncation and chunking
|
|
3. **LLM Response Quality**: Add response validation and fallback mechanisms
|
|
4. **Performance**: Cache common queries and optimize context building
|
|
|
|
### User Experience Risks
|
|
1. **Complex Configuration**: Provide setup wizard and clear documentation
|
|
2. **Learning Curve**: Include examples and guided tutorials
|
|
3. **Profile Privacy**: Implement secure storage and optional features
|
|
4. **Cost Concerns**: Add usage tracking and budget controls
|
|
|
|
## 📝 Requirements Engineering Notes
|
|
|
|
**FOR REQUIREMENTS ENGINEER:**
|
|
|
|
1. **User Research Needed**:
|
|
- Survey existing MarkiTect users about LLM integration preferences
|
|
- Gather template usage patterns and pain points
|
|
- Validate profile data schema with target users
|
|
|
|
2. **Technical Validation Required**:
|
|
- Verify OpenRouter API capabilities and limitations
|
|
- Test LLM response quality with MarkiTect content types
|
|
- Validate template field parsing edge cases
|
|
|
|
3. **Feature Prioritization**:
|
|
- Consider implementing #98 first for immediate value
|
|
- #99 can follow as enhanced template experience
|
|
- Both share OpenRouter infrastructure investment
|
|
|
|
4. **Alternative Approaches**:
|
|
- Consider other LLM providers (Anthropic direct, Azure OpenAI)
|
|
- Evaluate local LLM options for privacy-conscious users
|
|
- Template auto-fill could work without LLM (rule-based initially)
|
|
|
|
5. **Integration Points**:
|
|
- Leverage existing Query Paradigm system for #98
|
|
- Build on solid template foundation for #99
|
|
- Utilize configuration manager for seamless setup
|
|
|
|
**RECOMMENDATION**: Proceed with implementation in phases, starting with OpenRouter client and basic LLM integration for #98, then expanding to template auto-fill for #99. The shared infrastructure investment will benefit both features significantly. |