Files
markitect-main/history/LLM_INTEGRATION_GAMEPLAN.md
tegwick 371412bcbb docs: move LLM integration planning documents to history
- Moved LLM_INTEGRATION_GAMEPLAN.md to history/ (strategic planning complete)
- Moved IMPLEMENTATION_ISSUES.md to history/ (issues created in system)
- Both documents served their purpose in planning and issue creation
- Issues #100-109 now registered in MarkiTect issue management system
- Ready for future development when LLM integration work begins

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-04 00:29:51 +02:00

13 KiB

LLM Integration Gameplan - Issues #98 & #99

Date: 2025-10-03 Status: REQUIREMENTS ANALYSIS Priority: HIGH Estimated Effort: 4-6 weeks development

🎯 Executive Summary

Two complementary features that will transform MarkiTect from a content management system into an AI-powered knowledge assistant:

  • Issue #98: OpenRoute Integration - Enable LLM queries against MarkiTect content
  • Issue #99: Auto Fill Templates - LLM-powered interactive template completion

📋 Current State Analysis

Existing Infrastructure (Ready to Leverage)

  • Template System: Full template engine with parsing and rendering (markitect/template/)
  • Configuration Manager: Extensible config system with CLI integration
  • Query Paradigms: Natural Language paradigm exists (documented only)
  • CLI Framework: Click-based with established patterns
  • Database: SQLite with full metadata and content indexing
  • FTS Search: Full text search capabilities for content discovery

🏗️ Infrastructure Gaps (Need Development)

  • LLM Client: No OpenRouter integration exists
  • Profile System: No user profile management
  • Interactive UI: No terminal questionnaire system
  • Context Building: No intelligent content selection for LLM queries

🚀 Issue #98: OpenRoute Integration

Requirements Analysis

Goal: "Use MarkiTect ingested content as context for interacting with LLMs flexibly and conveniently"
User Story: "As a user, I want to ask natural language questions about my content and get intelligent responses with source citations"
Integration: "Allow users to connect with an existing OpenRouter account"

Technical Implementation Plan

Phase 1: Core LLM Infrastructure (Week 1)

  1. OpenRouter Client Development

    # markitect/llm/openrouter_client.py
    class OpenRouterClient:
        - API key management
        - Model selection (GPT-4, Claude, etc.)
        - Request/response handling
        - Rate limiting and error handling
        - Cost tracking
    
  2. Configuration Integration

    markitect config-set openrouter.api_key sk-or-...
    markitect config-set openrouter.default_model openai/gpt-4-turbo
    markitect config-show --show-sensitive  # Show API keys
    
  3. Basic CLI Commands

    markitect llm test                    # Test OpenRouter connection
    markitect llm models                  # List available models
    markitect llm ask "Simple question"   # Basic LLM interaction
    

Phase 2: Content Context Integration (Week 2)

  1. Context Builder System

    # markitect/llm/context_builder.py
    class ContextBuilder:
        - Extract relevant content from database
        - Use FTS search for content discovery
        - Build context within token limits
        - Include metadata and relationships
    
  2. Enhanced Natural Language Paradigm

    # Update markitect/query_paradigms/paradigms/natural_language_paradigm.py
    class NaturalLanguageQueryParadigm:
        - Integrate OpenRouter for real LLM processing
        - Build context from MarkiTect content
        - Return structured responses with citations
    
  3. Advanced CLI Integration

    markitect paradigms exec "Natural Language" "What are the main API concepts?"
    markitect llm chat                                    # Interactive mode
    markitect llm ask "Summarize docs tagged tutorial"   # Filtered context
    

Phase 3: Advanced Features (Week 3)

  1. Smart Context Selection

    • Relevance scoring for content inclusion
    • Context size optimization
    • Source citation tracking
  2. Response Enhancement

    • Markdown formatting
    • Source links back to MarkiTect files
    • Follow-up question suggestions

Success Criteria

  • OpenRouter integration working with API key configuration
  • Natural language queries return relevant, contextualized responses
  • Responses include source citations linking to MarkiTect files
  • Context building intelligently selects relevant content
  • CLI commands integrated with existing paradigm system

📝 Issue #99: Auto Fill Templates

Requirements Analysis

Goal: "Use Markdown Templates to capture data with terminal questionnaire and LLM auto-fill"
User Story: "As a user, I want to fill templates interactively, with the system auto-completing fields based on my profile"
LLM Integration: "Provided the user has a profile, an LLM should autofill based on the profile provided"

Technical Implementation Plan

Phase 1: Enhanced Template System (Week 1)

  1. Template Field Analysis

    # markitect/template/field_analyzer.py
    class TemplateFieldAnalyzer:
        - Parse template annotations: {{name:string:Your full name}}
        - Extract field types, descriptions, validation rules
        - Identify required vs optional fields
        - Support nested field structures
    
  2. Interactive Questionnaire Engine

    # markitect/template/questionnaire.py
    class TemplateQuestionnaire:
        - Terminal-based interactive data collection
        - Support input types: text, choice, date, number, boolean
        - Field validation and re-prompting
        - Progress tracking and partial save
    
  3. Basic CLI Commands

    markitect template-fill template.md            # Interactive questionnaire
    markitect template-analyze template.md         # Show template fields
    markitect template-validate template.md        # Validate template syntax
    

Phase 2: User Profile System (Week 2)

  1. Profile Management

    # markitect/profile/manager.py
    class ProfileManager:
        - Create, read, update, delete profiles
        - Support multiple profiles (personal, work, etc.)
        - Profile inheritance and templates
        - Database storage integration
    
  2. Profile Schema System

    # markitect/profile/schema.py
    - Standard profile fields (personal, professional, technical)
    - Custom field extensions
    - JSON Schema validation
    - Field type definitions and constraints
    
  3. Profile CLI Commands

    markitect profile create personal
    markitect profile set personal.name "John Doe"
    markitect profile set personal.email "john@example.com"
    markitect profile show personal
    markitect profile list
    markitect profile export personal profile.json
    

Phase 3: LLM-Powered Auto-Fill (Week 3)

  1. Smart Field Completion

    # markitect/template/auto_filler.py
    class LLMAutoFiller:
        - Use OpenRouter LLM for field suggestions
        - Context-aware completions based on template purpose
        - Profile-informed field values
        - Learning from user corrections
    
  2. Advanced Template Fill Modes

    markitect template-fill template.md --auto              # Auto-fill from profile
    markitect template-fill template.md --guided            # Mix auto + questions
    markitect template-fill template.md --profile=work      # Use specific profile
    markitect template-fill template.md --learn             # Learn from corrections
    

Phase 4: Advanced Features (Week 4)

  1. Field Intelligence

    • Template field learning and preferences
    • Content generation for complex fields
    • Multi-step form workflows
    • Field dependencies and conditional logic
  2. Integration Features

    • Template field suggestions based on existing content
    • Auto-population from MarkiTect database
    • Template version control and updates

Success Criteria

  • Interactive terminal questionnaire for template completion
  • User profile system with multiple profile support
  • LLM-powered auto-fill suggestions based on user profile
  • Enhanced template parser supporting field metadata
  • Seamless integration with existing template rendering system

🔗 Shared Infrastructure Requirements

Database Schema Extensions

-- User profiles table
CREATE TABLE user_profiles (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    data JSON NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- LLM interaction logs (optional)
CREATE TABLE llm_interactions (
    id INTEGER PRIMARY KEY,
    query TEXT NOT NULL,
    response TEXT NOT NULL,
    model TEXT NOT NULL,
    tokens_used INTEGER,
    cost REAL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Template usage history
CREATE TABLE template_usage (
    id INTEGER PRIMARY KEY,
    template_path TEXT NOT NULL,
    field_data JSON NOT NULL,
    profile_used TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Configuration Extensions

# .markitect.yml additions
openrouter:
  api_key: "sk-or-..."
  default_model: "openai/gpt-4-turbo"
  max_tokens: 4096
  temperature: 0.7

profiles:
  default_profile: "personal"
  auto_save: true

templates:
  auto_fill_mode: "guided"  # auto, interactive, guided
  learn_from_corrections: true

📊 Implementation Priority Matrix

Component Issue Priority Effort Dependencies
OpenRouter Client #98 HIGH 2 days Config system
Context Builder #98 HIGH 3 days FTS, Database
Profile Manager #99 HIGH 2 days Database
Template Field Parser #99 HIGH 3 days Template system
Interactive Questionnaire #99 MEDIUM 4 days Profile system
LLM Auto-Fill #99 MEDIUM 4 days OpenRouter, Profiles
Natural Language Enhancement #98 MEDIUM 2 days OpenRouter, Context
Advanced Context #98 LOW 3 days Basic LLM working

🧪 Testing Strategy

Unit Tests Required

  • OpenRouter client error handling and retries
  • Template field parsing and validation
  • Profile CRUD operations
  • Context building with different content types
  • LLM response formatting and citation extraction

Integration Tests Required

  • End-to-end template filling workflow
  • Natural language queries with MarkiTect context
  • Profile-based auto-fill accuracy
  • CLI command integration

Manual Testing Scenarios

  1. OpenRouter Setup: User configures API key and tests connection
  2. Template Creation: User creates template with various field types
  3. Profile Management: User creates and manages multiple profiles
  4. Interactive Fill: User completes template via questionnaire
  5. Auto-Fill: System suggests field values based on profile
  6. LLM Queries: User asks questions about their content
  7. Context Accuracy: Verify LLM responses cite correct sources

🎯 Success Metrics & KPIs

Quantitative Metrics

  • Template Completion Time: Reduce by 60% with auto-fill
  • Query Response Accuracy: >90% relevant context inclusion
  • User Satisfaction: >8/10 rating for LLM responses
  • Profile Usage: >75% of template fills use profile data

Qualitative Metrics

  • User Experience: Seamless workflow integration
  • Content Discovery: Users find value in LLM-powered content exploration
  • Productivity: Templates become preferred method for document creation
  • Accuracy: LLM suggestions match user intent and context

🚧 Risk Assessment & Mitigation

Technical Risks

  1. OpenRouter API Changes: Mitigate with versioned API client and error handling
  2. Token Limits: Implement intelligent context truncation and chunking
  3. LLM Response Quality: Add response validation and fallback mechanisms
  4. Performance: Cache common queries and optimize context building

User Experience Risks

  1. Complex Configuration: Provide setup wizard and clear documentation
  2. Learning Curve: Include examples and guided tutorials
  3. Profile Privacy: Implement secure storage and optional features
  4. Cost Concerns: Add usage tracking and budget controls

📝 Requirements Engineering Notes

FOR REQUIREMENTS ENGINEER:

  1. User Research Needed:

    • Survey existing MarkiTect users about LLM integration preferences
    • Gather template usage patterns and pain points
    • Validate profile data schema with target users
  2. Technical Validation Required:

    • Verify OpenRouter API capabilities and limitations
    • Test LLM response quality with MarkiTect content types
    • Validate template field parsing edge cases
  3. Feature Prioritization:

    • Consider implementing #98 first for immediate value
    • #99 can follow as enhanced template experience
    • Both share OpenRouter infrastructure investment
  4. Alternative Approaches:

    • Consider other LLM providers (Anthropic direct, Azure OpenAI)
    • Evaluate local LLM options for privacy-conscious users
    • Template auto-fill could work without LLM (rule-based initially)
  5. Integration Points:

    • Leverage existing Query Paradigm system for #98
    • Build on solid template foundation for #99
    • Utilize configuration manager for seamless setup

RECOMMENDATION: Proceed with implementation in phases, starting with OpenRouter client and basic LLM integration for #98, then expanding to template auto-fill for #99. The shared infrastructure investment will benefit both features significantly.