markitect-main/docs/WORKSPACE_AND_DATABASES.md
tegwick 7270bc559d
docs: complete project documentation and task management cleanup
### Documentation Updates
- Added comprehensive WORKSPACE_AND_DATABASES.md documentation explaining:
  - Markitect's workspace-based architecture concept
  - Database separation (markitect.db vs assets.db) and purposes
  - Configuration management and asset integration
  - Best practices for development, collaboration, and production

### Changelog Management
- Updated CHANGELOG.md with complete release history coverage
- Added missing v0.8.0 entry for setuptools-SCM integration and release automation
- Added proper version comparison links for all releases
- Documented all recent work in Unreleased section following Keep a Changelog format

### Task Management
- Cleaned TODO.md file by removing all completed tasks
- Reset to clean state referencing changelog for completed work
- Maintained Keep a Todofile format for future development sessions

This completes the documentation and task management improvements for
the ChatGPT theme implementation, modular theme system, issue-facade
bug fixes, and workspace architecture clarification work.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 14:34:54 +01:00

Markitect Workspace and Database Architecture

This document explains Markitect's workspace concept and the two distinct database systems used by the application.

Workspace Concept

Markitect uses a workspace-based architecture where each directory or repository can have its own configuration and local data storage. This allows for flexible, per-project customization while maintaining a global user configuration.

Workspace Structure

When you initialize Markitect in a directory, it creates the following structure:

project-directory/
├── .markitect.yml              # Workspace configuration
├── .markitect_workspace/       # Local workspace data
├── .ast_cache/                 # AST parsing cache
├── assets/                     # Asset storage directory
│   ├── assets.db              # Asset management database
│   └── [asset files]          # Stored images, files, etc.
└── tests/                     # Test files directory

Configuration Files

Markitect searches for configuration in this order:

  1. .markitect.yml (current directory)
  2. .markitect.yaml (current directory)
  3. .markitect.json (current directory)
  4. markitect.config.yml (current directory)
  5. markitect.config.yaml (current directory)
  6. markitect.config.json (current directory)
  7. ~/.markitect/config.yml (user home directory)
  8. Environment variables (MARKITECT_*)
  9. Built-in defaults
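
The file-based part of this search order can be sketched in a few lines. This is only an illustration of the lookup described above, not Markitect's actual implementation: the function name `find_config_file` and its parameters are hypothetical, and environment variables and built-in defaults (steps 8 and 9) are left out because they are not files.

```python
from pathlib import Path
from typing import Optional

# File names checked in the workspace directory, in the documented order.
WORKSPACE_CANDIDATES = [
    ".markitect.yml",
    ".markitect.yaml",
    ".markitect.json",
    "markitect.config.yml",
    "markitect.config.yaml",
    "markitect.config.json",
]


def find_config_file(workspace: Path, home: Path) -> Optional[Path]:
    """Return the first existing config file, or None if no file is found.

    Hypothetical helper: a None result means only environment variables
    and built-in defaults would apply.
    """
    for name in WORKSPACE_CANDIDATES:
        candidate = workspace / name
        if candidate.is_file():
            return candidate
    # Step 7: fall back to the user-level configuration file.
    user_config = home / ".markitect" / "config.yml"
    return user_config if user_config.is_file() else None
```

For example, `find_config_file(Path.cwd(), Path.home())` returns the workspace file if one exists and the user-level file otherwise.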

Database Architecture

Markitect uses two distinct SQLite databases for different purposes:

1. Main Application Database (markitect.db)

Location: ~/.markitect/markitect.db (user home directory)

Purpose: Global user-level application data and configuration

Scope: User-wide, shared across all workspaces

Contents:

  • User preferences and settings
  • Application state information
  • Global configuration data
  • Cross-workspace data that needs persistence

Configuration: Set via the MARKITECT_DATABASE_PATH environment variable or the database_path configuration key
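
As a rough sketch, resolving the database location from these two settings might look like the following. The function name `resolve_database_path` is hypothetical, and the precedence shown (environment variable first, then the configuration key, then the default location) is an assumption; this document does not state which one wins.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default location documented above.
DEFAULT_DB_PATH = Path("~/.markitect/markitect.db")


def resolve_database_path(
    config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the main database path (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: the environment variable takes precedence over the config key.
    raw = env.get("MARKITECT_DATABASE_PATH") or config.get("database_path")
    return Path(raw).expanduser() if raw else DEFAULT_DB_PATH.expanduser()
```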

2. Asset Management Database (assets.db)

Location: assets/assets.db (within workspace asset storage directory)

Purpose: Asset management and tracking for the current workspace

Scope: Workspace-specific, local to each directory/repository

Contents:

  • Asset metadata (filename, size, MIME type, timestamps)
  • File content hashes for deduplication
  • Asset usage statistics and tracking
  • Processing logs and analytics
  • Asset relationships and dependencies

Schema (key tables):

-- Basic asset metadata
asset_metadata (
    content_hash TEXT PRIMARY KEY,
    filename TEXT NOT NULL,
    size_bytes INTEGER NOT NULL,
    mime_type TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
)

-- Usage tracking
asset_usage_stats (
    content_hash TEXT,
    usage_count INTEGER,
    last_used TIMESTAMP,
    documents_using TEXT  -- JSON array of document paths
)

-- Performance and analytics tables
-- (Additional tables for caching, indexing, and optimization)
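
To make the two key tables concrete, the sketch below creates them in an in-memory SQLite database and looks up which documents use a given asset. The column names come from the schema above; the `CREATE TABLE` wording, the helper names `open_demo_db` and `documents_for`, and the sample row are illustrative assumptions, not Markitect's real code or data.

```python
import json
import sqlite3

# DDL written out from the shorthand schema above (exact DDL is an assumption).
SCHEMA = """
CREATE TABLE asset_metadata (
    content_hash TEXT PRIMARY KEY,
    filename TEXT NOT NULL,
    size_bytes INTEGER NOT NULL,
    mime_type TEXT,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);
CREATE TABLE asset_usage_stats (
    content_hash TEXT,
    usage_count INTEGER,
    last_used TIMESTAMP,
    documents_using TEXT
);
"""


def open_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with one made-up asset."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO asset_metadata (content_hash, filename, size_bytes, mime_type) "
        "VALUES (?, ?, ?, ?)",
        ("abc123", "logo.png", 2048, "image/png"),
    )
    conn.execute(
        "INSERT INTO asset_usage_stats (content_hash, usage_count, documents_using) "
        "VALUES (?, ?, ?)",
        ("abc123", 2, json.dumps(["docs/a.md", "docs/b.md"])),
    )
    return conn


def documents_for(conn: sqlite3.Connection, filename: str) -> list:
    """Return the document paths recorded for an asset, decoded from JSON."""
    row = conn.execute(
        "SELECT u.documents_using FROM asset_metadata AS m "
        "JOIN asset_usage_stats AS u ON u.content_hash = m.content_hash "
        "WHERE m.filename = ?",
        (filename,),
    ).fetchone()
    return json.loads(row[0]) if row and row[0] else []
```

Because `documents_using` is stored as a JSON array in a TEXT column, callers decode it after the query rather than filtering on it in SQL.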

Why Two Databases?

This separation serves several important purposes:

Data Isolation

  • Global data (user preferences) stays in the user profile
  • Workspace data (asset files, metadata) stays with the project

Version Control Considerations

  • markitect.db is never committed to version control
  • assets.db is excluded via .gitignore (local workspace data)
  • Asset files themselves can be optionally committed based on project needs

Performance Optimization

  • Asset database operations are localized to relevant files
  • Global database isn't impacted by large asset collections
  • Each workspace can optimize its asset database independently

Portability and Collaboration

  • Workspaces can be moved/copied without affecting global configuration
  • Teams can share asset storage strategies without sharing personal settings
  • Different projects can have different asset management policies

Configuration Management

Workspace Initialization

To initialize a new workspace:

markitect config-init

This creates:

  1. .markitect.yml configuration file
  2. Required directories (.markitect_workspace, .ast_cache, tests)
  3. Asset storage structure

Configuration Commands

# View current configuration
markitect config-show

# Set workspace-specific values
markitect config-set repo_name "my-project"
markitect config-set assets.storage_path "./assets"

# View configuration help
markitect config-help

Environment Variables

Override configuration with environment variables:

export MARKITECT_GITEA_URL="http://localhost:3000"
export MARKITECT_WORKSPACE_DIR=".custom_workspace"
export MARKITECT_DATABASE_PATH="/custom/path/markitect.db"
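
One plausible way such overrides could be collected is sketched below. The mapping rule (strip the `MARKITECT_` prefix and lowercase the rest, so `MARKITECT_DATABASE_PATH` becomes `database_path`) matches the one pairing this document mentions, but applying it to every variable is an assumption, as are the function names.

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "MARKITECT_"


def env_overrides(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Collect MARKITECT_* variables as lowercase config keys (assumed rule)."""
    env = os.environ if env is None else env
    return {
        name[len(PREFIX):].lower(): value
        for name, value in env.items()
        if name.startswith(PREFIX)
    }


def apply_overrides(
    config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of config with environment values layered on top."""
    merged = dict(config)
    merged.update(env_overrides(env))
    return merged
```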

Asset Management Integration

The asset management system coordinates between the asset database and file storage:

from markitect.assets import AssetManager

# Initialize with workspace-specific configuration
asset_manager = AssetManager()

# Assets are stored in workspace, tracked in assets.db
asset = asset_manager.add_asset("image.png")

Asset Storage Workflow

  1. File Processing: Asset files are processed and stored in the assets/ directory
  2. Database Recording: Metadata is recorded in assets.db
  3. Deduplication: Content hashes prevent duplicate storage
  4. Usage Tracking: Document usage recorded for analytics
  5. Cleanup: Unused assets can be identified and removed
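
The workflow above can be illustrated with a small, self-contained sketch. It is not the `AssetManager` implementation: a plain dictionary stands in for assets.db, SHA-256 is an assumed choice of content hash, and the function names and the stored-file naming scheme are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def store_asset(source: Path, assets_dir: Path, index: Dict[str, dict]) -> str:
    """Store a file once per unique content and record its metadata."""
    data = source.read_bytes()
    content_hash = hashlib.sha256(data).hexdigest()  # assumed hash algorithm
    if content_hash not in index:  # deduplication: same bytes, same entry
        assets_dir.mkdir(parents=True, exist_ok=True)
        (assets_dir / f"{content_hash}{source.suffix}").write_bytes(data)
        index[content_hash] = {
            "filename": source.name,
            "size_bytes": len(data),
            "documents_using": [],
        }
    return content_hash


def record_usage(index: Dict[str, dict], content_hash: str, document: str) -> None:
    """Note that a document references the asset."""
    documents = index[content_hash]["documents_using"]
    if document not in documents:
        documents.append(document)


def unused_assets(index: Dict[str, dict]) -> List[str]:
    """Return hashes of assets that no document references."""
    return [h for h, meta in index.items() if not meta["documents_using"]]
```

Keying everything on the content hash is what makes deduplication and cleanup cheap: a second copy of the same bytes maps to the existing entry, and an entry with no referencing documents is a cleanup candidate.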

Best Practices

For Development

  • Initialize workspace in each project directory
  • Commit .markitect.yml to version control
  • Add assets.db and workspace directories to .gitignore
  • Use relative paths in workspace configuration

For Collaboration

  • Share workspace configuration (.markitect.yml)
  • Document asset storage strategy for the team
  • Establish conventions for asset organization
  • Consider asset file version control policies

For Production

  • Back up both databases separately
  • Monitor asset database growth in large projects
  • Use environment variables for deployment-specific settings
  • Implement asset cleanup strategies for long-running projects

Migration and Compatibility

Legacy Support

The system maintains backward compatibility:

  • Old configuration patterns still work
  • Automatic migration of legacy settings
  • Graceful fallbacks for missing configuration

Database Migration

Asset databases support schema migrations:

  • Automatic schema updates on version changes
  • Backward compatibility preservation
  • Safe migration with rollback capability

Troubleshooting

Common Issues

Database Connection Errors:

  • Check file permissions on database directories
  • Verify disk space availability
  • Ensure SQLite3 is available

Configuration Not Found:

  • Verify .markitect.yml exists and is valid YAML
  • Check environment variable names and values
  • Run markitect config-show to see current configuration

Asset Storage Issues:

  • Confirm asset directory permissions
  • Check available disk space
  • Verify assets.db is not corrupted
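
One way to check whether assets.db is corrupted, independent of Markitect itself, is SQLite's built-in `PRAGMA integrity_check`. The helper below is a sketch with a hypothetical name; it uses only the Python standard library and treats a missing or unreadable file as unhealthy.

```python
import sqlite3
from pathlib import Path


def database_is_healthy(db_path: Path) -> bool:
    """Return True if SQLite reports the database file as intact."""
    if not db_path.is_file():
        return False
    try:
        conn = sqlite3.connect(str(db_path))
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a SQLite database at all.
        return False
    # A healthy database yields a single row containing "ok".
    return rows == [("ok",)]
```

Calling `database_is_healthy(Path("assets/assets.db"))` before moving the file aside helps distinguish real corruption from a permissions or disk-space problem.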

Recovery Procedures

Corrupted Asset Database:

# Back up and recreate
mv assets/assets.db assets/assets.db.backup
# Restart Markitect to recreate schema
markitect config-show

Missing Configuration:

# Reinitialize workspace
markitect config-init

Technical Details

Database Connections

  • Uses SQLite3 with connection pooling for performance
  • Automatic connection management and cleanup
  • Thread-safe operations for concurrent access

File System Integration

  • Path resolution relative to workspace root
  • Cross-platform path handling (Windows, macOS, Linux)
  • Symlink and junction support where available
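
A minimal sketch of workspace-relative path resolution, assuming relative values such as `./assets` are interpreted against the workspace root and absolute values are used unchanged. The function name is hypothetical and Markitect's actual resolution rules may differ.

```python
from pathlib import Path


def resolve_in_workspace(workspace_root: Path, configured: str) -> Path:
    """Resolve a configured path against the workspace root."""
    path = Path(configured).expanduser()
    if not path.is_absolute():
        # Relative values are anchored at the workspace root, not the CWD.
        path = workspace_root / path
    # resolve() normalizes "." and ".." and follows symlinks where they exist.
    return path.resolve()
```

Using `pathlib` keeps the same code working across Windows, macOS, and Linux path conventions.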

Security Considerations

  • Database files have restricted permissions
  • No sensitive data stored in asset database
  • Configuration masking for sensitive values in display
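
The masking of sensitive values could work along these lines. This is a sketch only: which keys count as sensitive, the `****` placeholder, and the function name are all assumptions rather than Markitect's documented behavior.

```python
from typing import Any, Dict

# Assumed markers for sensitive key names; the real list is not documented here.
SENSITIVE_MARKERS = ("token", "password", "secret", "api_key")


def mask_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of config that is safe to display."""
    masked: Dict[str, Any] = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Recurse into nested sections such as "assets".
            masked[key] = mask_config(value)
        elif value and any(marker in key.lower() for marker in SENSITIVE_MARKERS):
            masked[key] = "****"
        else:
            masked[key] = value
    return masked
```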

This documentation reflects the current architecture as of November 2025. For implementation details, see the source code in markitect/config_manager.py and markitect/assets/.