diff --git a/SCOPE.md b/SCOPE.md
new file mode 100644
index 00000000..0571392e
--- /dev/null
+++ b/SCOPE.md
@@ -0,0 +1,104 @@
+# SCOPE
+
+> This file helps you quickly understand what this repository is about,
+> when it is relevant, and when it is not.
+> It is intentionally lightweight and may be incomplete.
+
+---
+
+## One-liner
+
+Intelligent markdown engine and information management platform — treats documents as structured, queryable information spaces with schema validation, transclusion, LLM-driven evaluation, and infospace lifecycle management.
+
+---
+
+## Core Idea
+
+MarkiTect turns fragmented knowledge (scattered docs, chats, notes) into structured, versioned, reusable artifacts. The core abstraction is an **infospace**: a curated collection of typed entities (concepts, mechanisms, observations) governed by a YAML config, validated against schemas, and evaluated for quality across five dimensions. The platform automates generation, validation, and transformation at scale, delegating domain-level judgment to LLMs while Python handles structure and evaluation.
+
+---
+
+## In Scope
+
+- Parse, validate, and analyze markdown documents against schemas
+- Generate schemas from example documents; enforce naming convention `{domain}-schema-v{major}.{minor}.md`
+- Infospace lifecycle: create, populate, evaluate (per-entity + collection quality scores), compose, export
+- Transclusion: embed content from one document into another, maintaining single source of truth
+- LLM-driven prompt execution with dependency resolution and quality gates
+- Relationship graph export (Mermaid, DOT) and analysis (networkx, FCA)
+- Batch document processing; CLI (`markitect `) and programmatic API
+- Rendering: markdown → interactive HTML via plugin system (testdrive-jsui)
+- Asset management (image embedding, resource handling)
+
+---
+
+## Out of Scope
+
+- Visual/WYSIWYG editing (markdown-first, text-based workflows only)
+- Real-time collaborative editing (git-based versioning instead)
+- Financial transactions or external payment integration
+- Making domain-level judgments in Python code (delegated to LLM via prompt templates)
+- Storing secrets or credentials in plaintext
+- Full GraphQL API (structure exists but not fully implemented)
+- Vendor-specific integrations or lock-in
+
+---
+
+## Relevant When
+
+- Managing large document sets (hundreds to thousands) needing consistent structure and validation
+- Building or maintaining institutional knowledge bases, technical documentation, or canon releases
+- Automating document generation from schemas or templates
+- Tracking relationships and dependencies between knowledge artifacts
+- Needing programmatic access to document structure (beyond file reading)
+- Applying quality evaluation to a structured concept collection
+
+---
+
+## Not Relevant When
+
+- Working with a handful of simple, unrelated documents
+- Visual editor required
+- Exclusively non-markdown source formats (PDF/Word need conversion first)
+- No consistency, validation, or automation needed
+
+---
+
+## Current State
+
+- Status: active (v0.13.0-dev, ~90 commits ahead of release)
+- Implementation: substantial — core modules mature (CLI, parsing, schema management, prompt execution, infospace); infospace S3 close-out in progress; LLM adapter extracted to standalone `llm-connect` package
+- Stability: stable core; plugin system and infospace tooling evolving; 200+ CHANGELOG entries since v0.6.0
+- Usage: active personal development; examples with 988 entities and full evaluation pipeline
+
+---
+
+## How It Fits
+
+- Upstream dependencies: `llm-connect` (LLM adapter library, extracted), `testdrive-jsui` (rendering plugin submodule), `markitect-utils` (utility library)
+- Downstream consumers: Custodian — MarkiTect is the knowledge artifact platform in the canonical dependency order (Railiance → **Markitect** → Coulomb.social → Personhood/Foerster → Custodian)
+- Often used with: the-custodian (state hub tracks markitect domain workstreams), kaizen-agentic (project-management agent for session workflow)
+
+---
+
+## Terminology
+
+- Preferred terms: infospace, topic, discipline, entity, evaluation, viability, transclusion, schema, quality gates
+- Also known as: "markitect", "the markdown engine"
+- Potentially confusing terms: "topic" = the subject matter an infospace explains (not a chat thread); "discipline" = a reusable framework of concepts (itself a viable infospace); "infospace" ≠ filesystem directory (it's a curated conceptual collection with explicit quality thresholds)
+
+---
+
+## Related / Overlapping Repositories
+
+- `llm-connect` — standalone LLM adapter extracted from MarkiTect (dependency)
+- `the-custodian` — tracks markitect workstreams; custodian canon includes a markitect domain charter
+- `marki-docx` — separate repo (on tegwick machine); relationship: docx export capability for MarkiTect artifacts
+
+---
+
+## Getting Oriented
+
+- Start with: `CLAUDE.md` (dev commands, LLM config, infospace lifecycle), `INTRODUCTION.md` (use cases, philosophy)
+- Key files / directories: `markitect/cli.py` (CLI entry point), `markitect/infospace/` (primary active area), `markitect/prompts/` (LLM execution), `roadmap/` (6 active planning tracks), `examples/infospace-with-history/` (988-entity reference implementation)
+- Entry points: `markitect --help`; `markitect infospace --help`; `pytest tests/unit/` (inner TDD loop)
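Note on the schema naming convention listed under "In Scope": the pattern `{domain}-schema-v{major}.{minor}.md` is simple enough to check mechanically. The following is a minimal Python sketch of such a check, not MarkiTect's actual implementation. The function name `parse_schema_name` is hypothetical, and the rule that a domain consists of lowercase alphanumeric words joined by single hyphens is an assumption; SCOPE.md only fixes the overall shape of the filename.

```python
import re

# Assumption: {domain} is lowercase alphanumeric words joined by single hyphens.
# SCOPE.md only specifies the shape {domain}-schema-v{major}.{minor}.md.
SCHEMA_NAME = re.compile(
    r"(?P<domain>[a-z0-9]+(?:-[a-z0-9]+)*)"
    r"-schema-v(?P<major>\d+)\.(?P<minor>\d+)\.md"
)


def parse_schema_name(filename: str):
    """Return (domain, major, minor), or None if the name breaks the convention.

    Hypothetical helper for illustration; not part of the markitect API.
    """
    match = SCHEMA_NAME.fullmatch(filename)
    if match is None:
        return None
    return match["domain"], int(match["major"]), int(match["minor"])


if __name__ == "__main__":
    for name in ("blog-schema-v1.2.md", "api-docs-schema-v10.0.md", "blog-schema-1.2.md"):
        print(name, "->", parse_schema_name(name))
```

Returning the parsed parts rather than a bare boolean lets a caller compare versions numerically, so `v10.0` sorts after `v2.0`, which a plain string comparison would get wrong.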