Files
markitect-main/SCOPE.md
tegwick 479fa95fdf
Some checks failed
Test Suite / unit-tests (3.11) (push) Has been cancelled
Test Suite / unit-tests (3.12) (push) Has been cancelled
Test Suite / code-quality (push) Has been cancelled
Test Suite / security-scan (push) Has been cancelled
Test Suite / integration-tests (push) Has been cancelled
Test Suite / e2e-tests (push) Has been cancelled
Test Suite / performance-tests (push) Has been cancelled
Test Suite / test-summary (push) Has been cancelled
Scope update from repo-scoping refactor
2026-05-01 12:27:17 +02:00

130 lines
6.1 KiB
Markdown

# SCOPE
> This file helps you quickly understand what this repository is about,
> when it is relevant, and when it is not.
> It is intentionally lightweight and may be incomplete.
---
## One-liner
Intelligent markdown engine and information management platform — treats documents as structured, queryable information spaces with schema validation, transclusion, LLM-driven evaluation, and infospace lifecycle management.
---
## Core Idea
MarkiTect turns fragmented knowledge (scattered docs, chats, notes) into structured, versioned, reusable artifacts. The core abstraction is an **infospace**: a curated collection of typed entities (concepts, mechanisms, observations) governed by a YAML config, validated against schemas, and evaluated for quality across five dimensions. The platform automates generation, validation, and transformation at scale, delegating domain-level judgment to LLMs while Python handles structure and evaluation.
---
## In Scope
- Parse, validate, and analyze markdown documents against schemas
- Generate schemas from example documents; enforce naming convention `{domain}-schema-v{major}.{minor}.md`
- Infospace lifecycle: create, populate, evaluate (per-entity + collection quality scores), compose, export
- Transclusion: embed content from one document into another, maintaining single source of truth
- LLM-driven prompt execution with dependency resolution and quality gates
- Relationship graph export (Mermaid, DOT) and analysis (networkx, FCA)
- Batch document processing; CLI (`markitect <command>`) and programmatic API
- Rendering: markdown → interactive HTML via plugin system (testdrive-jsui)
- Asset management (image embedding, resource handling)
---
## Out of Scope
- Visual/WYSIWYG editing (markdown-first, text-based workflows only)
- Real-time collaborative editing (git-based versioning instead)
- Financial transactions or external payment integration
- Making domain-level judgments in Python code (delegated to LLM via prompt templates)
- Storing secrets or credentials in plaintext
- Full GraphQL API (structure exists but not fully implemented)
- Vendor-specific integrations or lock-in
---
## Relevant When
- Managing large document sets (hundreds to thousands) needing consistent structure and validation
- Building or maintaining institutional knowledge bases, technical documentation, or canon releases
- Automating document generation from schemas or templates
- Tracking relationships and dependencies between knowledge artifacts
- Needing programmatic access to document structure (beyond file reading)
- Applying quality evaluation to a structured concept collection
---
## Not Relevant When
- Working with a handful of simple, unrelated documents
- Visual editor required
- Exclusively non-markdown source formats (PDF/Word need conversion first)
- No consistency, validation, or automation needed
---
## Current State
- Status: active (v0.13.0-dev, ~90 commits ahead of release)
- Implementation: substantial — core modules mature (CLI, parsing, schema management, prompt execution, infospace); infospace S3 close-out in progress; LLM adapter extracted to standalone `llm-connect` package
- Stability: stable core; plugin system and infospace tooling evolving; 200+ CHANGELOG entries since v0.6.0
- Usage: active personal development; examples with 988 entities and full evaluation pipeline
---
## How It Fits
- Upstream dependencies: `llm-connect` (LLM adapter library, extracted), `testdrive-jsui` (rendering plugin submodule), `markitect-utils` (utility library)
- Downstream consumers: Custodian — MarkiTect is the knowledge artifact platform in the canonical dependency order (Railiance → **Markitect** → Coulomb.social → Personhood/Foerster → Custodian)
- Often used with: the-custodian (state hub tracks markitect domain workstreams), kaizen-agentic (project-management agent for session workflow)
---
## Terminology
- Preferred terms: infospace, topic, discipline, entity, evaluation, viability, transclusion, schema, quality gates
- Also known as: "markitect", "the markdown engine"
- Potentially confusing terms: "topic" = the subject matter an infospace explains (not a chat thread); "discipline" = a reusable framework of concepts (itself a viable infospace); "infospace" ≠ filesystem directory (it's a curated conceptual collection with explicit quality thresholds)
---
## Related / Overlapping
- `llm-connect` — standalone LLM adapter extracted from MarkiTect (dependency)
- `the-custodian` — tracks markitect workstreams; custodian canon includes a markitect domain charter
- `marki-docx` — separate repo (on tegwick machine); relationship: docx export capability for MarkiTect artifacts
---
## Provided Capabilities
```capability
type: documentation
title: Structured document validation and schema management
description: Parse, validate, and enforce schemas on markdown documents — generate schemas from examples, validate entity collections, report naming convention compliance.
keywords: [markdown, schema, validation, document, structure, linting]
```
```capability
type: documentation
title: Infospace lifecycle management
description: Create, populate, evaluate (quality scores), compose, and export curated knowledge collections (infospaces) with transclusion and relationship graph analysis.
keywords: [infospace, knowledge, curation, evaluation, transclusion, quality, graph]
```
```capability
type: data
title: LLM-driven knowledge artifact generation
description: Execute prompts with dependency resolution and quality gates to generate typed entities — concepts, mechanisms, observations — at scale from schemas and templates.
keywords: [llm, generation, prompt, entity, artifact, knowledge, automation]
```
---
## Getting Oriented
- Start with: `CLAUDE.md` (dev commands, LLM config, infospace lifecycle), `INTRODUCTION.md` (use cases, philosophy)
- Key files / directories: `markitect/cli.py` (CLI entry point), `markitect/infospace/` (primary active area), `markitect/prompts/` (LLM execution), `roadmap/` (6 active planning tracks), `examples/infospace-with-history/` (988-entity reference implementation)
- Entry points: `markitect --help`; `markitect infospace --help`; `pytest tests/unit/` (inner TDD loop)