# Markitect Workspace and Database Architecture

This document explains Markitect's workspace concept and the two distinct database systems used by the application.

## Workspace Concept

Markitect uses a **workspace-based architecture** where each directory or repository can have its own configuration and local data storage. This allows for flexible, per-project customization while maintaining a global user configuration.

### Workspace Structure

When you initialize Markitect in a directory, it creates the following structure:

```
project-directory/
├── .markitect.yml          # Workspace configuration
├── .markitect_workspace/   # Local workspace data
├── .ast_cache/             # AST parsing cache
├── assets/                 # Asset storage directory
│   ├── assets.db           # Asset management database
│   └── [asset files]       # Stored images, files, etc.
└── tests/                  # Test files directory
```

### Configuration Files

Markitect searches for configuration in this order:

1. `.markitect.yml` (current directory)
2. `.markitect.yaml` (current directory)
3. `.markitect.json` (current directory)
4. `markitect.config.yml` (current directory)
5. `markitect.config.yaml` (current directory)
6. `markitect.config.json` (current directory)
7. `~/.markitect/config.yml` (user home directory)
8. Environment variables (`MARKITECT_*`)
9. Built-in defaults

## Database Architecture

Markitect uses two distinct SQLite databases for different purposes:

### 1. Main Application Database (`markitect.db`)

**Location**: `~/.markitect/markitect.db` (user home directory)

**Purpose**: Global user-level application data and configuration

**Scope**: User-wide, shared across all workspaces

**Contents**:
- User preferences and settings
- Application state information
- Global configuration data
- Cross-workspace data that needs persistence

**Configuration**: Set via `MARKITECT_DATABASE_PATH` environment variable or `database_path` in configuration

### 2. Asset Management Database (`assets.db`)

**Location**: `assets/assets.db` (within workspace asset storage directory)

**Purpose**: Asset management and tracking for the current workspace

**Scope**: Workspace-specific, local to each directory/repository

**Contents**:
- Asset metadata (filename, size, MIME type, timestamps)
- File content hashes for deduplication
- Asset usage statistics and tracking
- Processing logs and analytics
- Asset relationships and dependencies

**Schema** (key tables):

```sql
-- Basic asset metadata
asset_metadata (
    content_hash TEXT PRIMARY KEY,
    filename TEXT NOT NULL,
    size_bytes INTEGER NOT NULL,
    mime_type TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
)

-- Usage tracking
asset_usage_stats (
    content_hash TEXT,
    usage_count INTEGER,
    last_used TIMESTAMP,
    documents_using TEXT  -- JSON array of document paths
)

-- Performance and analytics tables
-- (Additional tables for caching, indexing, and optimization)
```

## Why Two Databases?

This separation serves several important purposes:

### Data Isolation

- **Global data** (user preferences) stays in the user profile
- **Workspace data** (asset files, metadata) stays with the project

### Version Control Considerations

- `markitect.db` is never committed to version control
- `assets.db` is excluded via `.gitignore` (local workspace data)
- Asset files themselves can be optionally committed based on project needs

### Performance Optimization

- Asset database operations are localized to relevant files
- Global database isn't impacted by large asset collections
- Each workspace can optimize its asset database independently

### Portability and Collaboration

- Workspaces can be moved/copied without affecting global configuration
- Teams can share asset storage strategies without sharing personal settings
- Different projects can have different asset management policies

## Configuration Management

### Workspace Initialization

To initialize a new workspace:

```bash
markitect config-init
```

This creates:

1. `.markitect.yml` configuration file
2. Required directories (`.markitect_workspace`, `.ast_cache`, `tests`)
3. Asset storage structure

### Configuration Commands

```bash
# View current configuration
markitect config-show

# Set workspace-specific values
markitect config-set repo_name "my-project"
markitect config-set assets.storage_path "./assets"

# View configuration help
markitect config-help
```

### Environment Variables

Override configuration with environment variables:

```bash
export MARKITECT_GITEA_URL="http://localhost:3000"
export MARKITECT_WORKSPACE_DIR=".custom_workspace"
export MARKITECT_DATABASE_PATH="/custom/path/markitect.db"
```

## Asset Management Integration

The asset management system coordinates between the asset database and file storage:

```python
from markitect.assets import AssetManager

# Initialize with workspace-specific configuration
asset_manager = AssetManager()

# Assets are stored in workspace, tracked in assets.db
asset = asset_manager.add_asset("image.png")
```

### Asset Storage Workflow

1. **File Processing**: Asset files are processed and stored in the `assets/` directory
2. **Database Recording**: Metadata is recorded in `assets.db`
3. **Deduplication**: Content hashes prevent duplicate storage
4. **Usage Tracking**: Document usage is recorded for analytics
5. **Cleanup**: Unused assets can be identified and removed

## Best Practices

### For Development

- Initialize workspace in each project directory
- Commit `.markitect.yml` to version control
- Add `assets.db` and workspace directories to `.gitignore`
- Use relative paths in workspace configuration

### For Collaboration

- Share workspace configuration (`.markitect.yml`)
- Document asset storage strategy for the team
- Establish conventions for asset organization
- Consider asset file version control policies

### For Production

- Back up both databases separately
- Monitor asset database growth in large projects
- Use environment variables for deployment-specific settings
- Implement asset cleanup strategies for long-running projects

## Migration and Compatibility

### Legacy Support

The system maintains backward compatibility:

- Old configuration patterns still work
- Automatic migration of legacy settings
- Graceful fallbacks for missing configuration

### Database Migration

Asset databases support schema migrations:

- Automatic schema updates on version changes
- Backward compatibility preservation
- Safe migration with rollback capability

## Troubleshooting

### Common Issues

**Database Connection Errors**:
- Check file permissions on database directories
- Verify disk space availability
- Ensure SQLite3 is available

**Configuration Not Found**:
- Verify `.markitect.yml` exists and is valid YAML
- Check environment variable names and values
- Run `markitect config-show` to see current configuration

**Asset Storage Issues**:
- Confirm asset directory permissions
- Check available disk space
- Verify `assets.db` is not corrupted

### Recovery Procedures

**Corrupted Asset Database**:

```bash
# Backup and recreate
mv assets/assets.db assets/assets.db.backup

# Restart Markitect to recreate schema
markitect config-show
```

**Missing Configuration**:

```bash
# Reinitialize workspace
markitect config-init
```

## Technical Details

### Database Connections

- Uses SQLite3 with connection pooling for performance
- Automatic connection management and cleanup
- Thread-safe operations for concurrent access

### File System Integration

- Path resolution relative to workspace root
- Cross-platform path handling (Windows, macOS, Linux)
- Symlink and junction support where available

### Security Considerations

- Database files have restricted permissions
- No sensitive data stored in asset database
- Configuration masking for sensitive values in display

---

*This documentation reflects the current architecture as of November 2025. For implementation details, see the source code in `markitect/config_manager.py` and `markitect/assets/`.*
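
The deduplication step of the Asset Storage Workflow, combined with the `asset_metadata` table from the schema section, can be illustrated with a small standalone sketch. This is not Markitect's actual implementation: the `record_asset` helper is hypothetical, SHA-256 is an assumed hash algorithm (the document does not name one), the column defaults are added for convenience, and an in-memory SQLite database stands in for `assets/assets.db`.

```python
import hashlib
import sqlite3

# Table definition adapted from the asset_metadata schema in this document.
# The DEFAULT clauses are an assumption added so the sketch can omit timestamps.
SCHEMA = """
CREATE TABLE IF NOT EXISTS asset_metadata (
    content_hash TEXT PRIMARY KEY,
    filename     TEXT NOT NULL,
    size_bytes   INTEGER NOT NULL,
    mime_type    TEXT,
    created_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""


def record_asset(conn, filename, data, mime_type=None):
    """Hypothetical helper: store metadata once per unique content.

    Returns (content_hash, is_new).
    """
    # Assumption: SHA-256 of the file bytes serves as the content hash.
    content_hash = hashlib.sha256(data).hexdigest()

    # If the hash is already present, the content is a duplicate.
    already_known = conn.execute(
        "SELECT 1 FROM asset_metadata WHERE content_hash = ?",
        (content_hash,),
    ).fetchone() is not None

    if not already_known:
        conn.execute(
            "INSERT INTO asset_metadata "
            "(content_hash, filename, size_bytes, mime_type) "
            "VALUES (?, ?, ?, ?)",
            (content_hash, filename, len(data), mime_type),
        )
        conn.commit()

    return content_hash, not already_known


# In-memory database used here instead of assets/assets.db.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Two files with identical bytes but different names.
first = record_asset(conn, "logo.png", b"fake image bytes", "image/png")
second = record_asset(conn, "logo-copy.png", b"fake image bytes", "image/png")
row_count = conn.execute("SELECT COUNT(*) FROM asset_metadata").fetchone()[0]
```

Because `content_hash` is the primary key, the second call finds the existing row and adds nothing, so the two identically-hashed files share a single metadata record. A real implementation would presumably also update `asset_usage_stats` at this point; that part is omitted here.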