Files
markitect-main/docs/WORKSPACE_AND_DATABASES.md
tegwick 7270bc559d
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
docs: complete project documentation and task management cleanup
### Documentation Updates
- Added comprehensive WORKSPACE_AND_DATABASES.md documentation explaining:
  - Markitect's workspace-based architecture concept
  - Database separation (markitect.db vs assets.db) and purposes
  - Configuration management and asset integration
  - Best practices for development, collaboration, and production

### Changelog Management
- Updated CHANGELOG.md with complete release history coverage
- Added missing v0.8.0 entry for setuptools-SCM integration and release automation
- Added proper version comparison links for all releases
- Documented all recent work in Unreleased section following Keep a Changelog format

### Task Management
- Cleaned TODO.md file by removing all completed tasks
- Reset to clean state referencing changelog for completed work
- Maintained Keep a Todofile format for future development sessions

This completes the documentation and task management improvements for
the ChatGPT theme implementation, modular theme system, issue-facade
bug fixes, and workspace architecture clarification work.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 14:34:54 +01:00

268 lines
7.9 KiB
Markdown

# Markitect Workspace and Database Architecture
This document explains Markitect's workspace concept and the two distinct database systems used by the application.
## Workspace Concept
Markitect uses a **workspace-based architecture** where each directory or repository can have its own configuration and local data storage. This allows for flexible, per-project customization while maintaining a global user configuration.
### Workspace Structure
When you initialize Markitect in a directory, it creates the following structure:
```
project-directory/
├── .markitect.yml # Workspace configuration
├── .markitect_workspace/ # Local workspace data
├── .ast_cache/ # AST parsing cache
├── assets/ # Asset storage directory
│ ├── assets.db # Asset management database
│ └── [asset files] # Stored images, files, etc.
└── tests/ # Test files directory
```
### Configuration Files
Markitect searches for configuration in this order:
1. `.markitect.yml` (current directory)
2. `.markitect.yaml` (current directory)
3. `.markitect.json` (current directory)
4. `markitect.config.yml` (current directory)
5. `markitect.config.yaml` (current directory)
6. `markitect.config.json` (current directory)
7. `~/.markitect/config.yml` (user home directory)
8. Environment variables (`MARKITECT_*`)
9. Built-in defaults
## Database Architecture
Markitect uses two distinct SQLite databases for different purposes:
### 1. Main Application Database (`markitect.db`)
**Location**: `~/.markitect/markitect.db` (user home directory)
**Purpose**: Global user-level application data and configuration
**Scope**: User-wide, shared across all workspaces
**Contents**:
- User preferences and settings
- Application state information
- Global configuration data
- Cross-workspace data that needs persistence
**Configuration**: Set via `MARKITECT_DATABASE_PATH` environment variable or `database_path` in configuration
### 2. Asset Management Database (`assets.db`)
**Location**: `assets/assets.db` (within workspace asset storage directory)
**Purpose**: Asset management and tracking for the current workspace
**Scope**: Workspace-specific, local to each directory/repository
**Contents**:
- Asset metadata (filename, size, MIME type, timestamps)
- File content hashes for deduplication
- Asset usage statistics and tracking
- Processing logs and analytics
- Asset relationships and dependencies
**Schema** (key tables):
```sql
-- Basic asset metadata
asset_metadata (
content_hash TEXT PRIMARY KEY,
filename TEXT NOT NULL,
size_bytes INTEGER NOT NULL,
mime_type TEXT,
created_at TIMESTAMP,
updated_at TIMESTAMP
)
-- Usage tracking
asset_usage_stats (
content_hash TEXT,
usage_count INTEGER,
last_used TIMESTAMP,
documents_using TEXT -- JSON array of document paths
)
-- Performance and analytics tables
-- (Additional tables for caching, indexing, and optimization)
```
## Why Two Databases?
This separation serves several important purposes:
### Data Isolation
- **Global data** (user preferences) stays in the user profile
- **Workspace data** (asset files, metadata) stays with the project
### Version Control Considerations
- `markitect.db` is never committed to version control
- `assets.db` is excluded via `.gitignore` (local workspace data)
- Asset files themselves can be optionally committed based on project needs
### Performance Optimization
- Asset database operations are localized to relevant files
- Global database isn't impacted by large asset collections
- Each workspace can optimize its asset database independently
### Portability and Collaboration
- Workspaces can be moved/copied without affecting global configuration
- Teams can share asset storage strategies without sharing personal settings
- Different projects can have different asset management policies
## Configuration Management
### Workspace Initialization
To initialize a new workspace:
```bash
markitect config-init
```
This creates:
1. `.markitect.yml` configuration file
2. Required directories (`.markitect_workspace`, `.ast_cache`, `tests`)
3. Asset storage structure
### Configuration Commands
```bash
# View current configuration
markitect config-show
# Set workspace-specific values
markitect config-set repo_name "my-project"
markitect config-set assets.storage_path "./assets"
# View configuration help
markitect config-help
```
### Environment Variables
Override configuration with environment variables:
```bash
export MARKITECT_GITEA_URL="http://localhost:3000"
export MARKITECT_WORKSPACE_DIR=".custom_workspace"
export MARKITECT_DATABASE_PATH="/custom/path/markitect.db"
```
## Asset Management Integration
The asset management system coordinates between the asset database and file storage:
```python
from markitect.assets import AssetManager
# Initialize with workspace-specific configuration
asset_manager = AssetManager()
# Assets are stored in workspace, tracked in assets.db
asset = asset_manager.add_asset("image.png")
```
### Asset Storage Workflow
1. **File Processing**: Asset files are processed and stored in `assets/` directory
2. **Database Recording**: Metadata recorded in `assets.db`
3. **Deduplication**: Content hashes prevent duplicate storage
4. **Usage Tracking**: Document usage recorded for analytics
5. **Cleanup**: Unused assets can be identified and removed
## Best Practices
### For Development
- Initialize workspace in each project directory
- Commit `.markitect.yml` to version control
- Add `assets.db` and workspace directories to `.gitignore`
- Use relative paths in workspace configuration
### For Collaboration
- Share workspace configuration (`.markitect.yml`)
- Document asset storage strategy for the team
- Establish conventions for asset organization
- Consider asset file version control policies
### for Production
- Backup both databases separately
- Monitor asset database growth in large projects
- Use environment variables for deployment-specific settings
- Implement asset cleanup strategies for long-running projects
## Migration and Compatibility
### Legacy Support
The system maintains backward compatibility:
- Old configuration patterns still work
- Automatic migration of legacy settings
- Graceful fallbacks for missing configuration
### Database Migration
Asset databases support schema migrations:
- Automatic schema updates on version changes
- Backward compatibility preservation
- Safe migration with rollback capability
## Troubleshooting
### Common Issues
**Database Connection Errors**:
- Check file permissions on database directories
- Verify disk space availability
- Ensure SQLite3 is available
**Configuration Not Found**:
- Verify `.markitect.yml` exists and is valid YAML
- Check environment variable names and values
- Run `markitect config-show` to see current configuration
**Asset Storage Issues**:
- Confirm asset directory permissions
- Check available disk space
- Verify `assets.db` is not corrupted
### Recovery Procedures
**Corrupted Asset Database**:
```bash
# Backup and recreate
mv assets/assets.db assets/assets.db.backup
# Restart Markitect to recreate schema
markitect config-show
```
**Missing Configuration**:
```bash
# Reinitialize workspace
markitect config-init
```
## Technical Details
### Database Connections
- Uses SQLite3 with connection pooling for performance
- Automatic connection management and cleanup
- Thread-safe operations for concurrent access
### File System Integration
- Path resolution relative to workspace root
- Cross-platform path handling (Windows, macOS, Linux)
- Symlink and junction support where available
### Security Considerations
- Database files have restricted permissions
- No sensitive data stored in asset database
- Configuration masking for sensitive values in display
---
*This documentation reflects the current architecture as of November 2025. For implementation details, see the source code in `markitect/config_manager.py` and `markitect/assets/`.*