Intent and specification files

2026-05-03 19:26:51 +02:00
parent 7b5161159a
commit 92ae074603
3 changed files with 500 additions and 0 deletions
--- a/INTENT.md
+++ b/INTENT.md
@@ -33,6 +33,20 @@ It turns markdown from plain text into a **programmable knowledge substrate**.

 ## Strategic Role in the System

+This repository is part of a layered knowledge system with clearly separated responsibilities:
+
+- **markitect-tool**     → makes markdown structured and manipulable
+- kontextual-engine  → makes knowledge persistent and operable
+- infospace-bench    → makes knowledge concrete and meaningful
+
+These layers correspond to a deliberate separation of concerns:
+
+* **Syntax layer** — structuring and transforming semi-structured data (markdown)
+* **System layer** — operating, persisting, and orchestrating knowledge
+* **Application layer** — applying knowledge systems to real-world contexts
+
+This repository occupies the **syntax layer** and should maintain **clear boundaries** with the other two layers.
+
 This repository acts as the **foundation layer for markdown-based knowledge systems**:

 * It provides **provider-neutral primitives** for working with structured markdown
--- a/wiki/FunctionalRequirementsSpecification.md
+++ b/wiki/FunctionalRequirementsSpecification.md
@@ -0,0 +1,261 @@
+# Markitect Tool Functional Requirements Specification V0.1
+
+## markitect-tool
+
+---
+
+## 1. System Overview
+
+markitect-tool is a **markdown-native toolkit and CLI** that enables users and systems to **parse, validate, transform, query, and generate markdown-based knowledge artifacts**.
+
+This FRS defines the **externally observable functional behavior** of the system.
+
+---
+
+## 2. Actors and Interfaces
+
+### 2.1 Primary Actors
+
+* **User (Human Operator)** via CLI (`mkt`)
+* **Automation System (`atm`)** via CLI or API
+* **LLM Agent (`agt`)** via API or CLI orchestration
+* **External Systems** integrating via programmatic interface
+
+---
+
+### 2.2 System Interfaces
+
+* CLI interface (`mkt <command>`)
+* Programmatic API (library usage)
+* File system (markdown documents and configuration files)
+
+---
+
+## 3. Functional Requirements
+
+---
+
+## 3.1 Markdown Parsing and Structuring
+
+### FR-001: Parse Markdown into Structured Representation
+
+**Description:**
+The system must parse markdown documents into a structured, machine-interpretable representation.
+
+**Input:**
+
+* Markdown file(s)
+
+**Output:**
+
+* Structured representation (accessible via CLI/API)
+
+---
+
+### FR-002: Preserve Structural Elements
+
+The system must preserve:
+
+* Sections and headings
+* Metadata (e.g. frontmatter)
+* Content blocks
+
+in the structured representation.
+
+---
+
+## 3.2 Schema Definition and Validation
+
+### FR-010: Define Schema from Input
+
+The system must allow users to define schemas explicitly or derive them from existing markdown documents.
+
+---
+
+### FR-011: Validate Documents Against Schema
+
+**Input:**
+
+* Markdown document(s)
+* Schema definition
+
+**Output:**
+
+* Validation result indicating compliance or violations
+
+---
+
+### FR-012: Report Validation Results
+
+The system must provide:
+
+* Clear identification of violations
+* Location/context of errors
+
+---
+
+## 3.3 Transformation and Composition
+
+### FR-020: Transform Markdown Documents
+
+The system must allow transformation of markdown documents based on defined rules or operations.
+
+---
+
+### FR-021: Compose Documents from Multiple Sources
+
+The system must support combining multiple markdown inputs into a single output document.
+
+---
+
+### FR-022: Support Content Inclusion (Transclusion)
+
+The system must allow content from one document to be included in another.
+
+---
+
+## 3.4 Query and Extraction
+
+### FR-030: Query Structured Content
+
+The system must allow querying of structured representations of markdown documents.
+
+---
+
+### FR-031: Extract Content Based on Criteria
+
+**Input:**
+
+* Query parameters
+
+**Output:**
+
+* Matching content or elements
+
+---
+
+## 3.5 Templating and Generation
+
+### FR-040: Generate Markdown from Templates
+
+The system must generate markdown documents based on:
+
+* Input content
+* Templates and/or rules
+
+---
+
+### FR-041: Support Rule-Based Generation
+
+The system must allow generation driven by defined rules expressed in markdown or configuration.
+
+---
+
+### FR-042: Support LLM-Assisted Generation
+
+The system must support generation workflows that incorporate LLM-based processing.
+
+(Note: LLM usage is optional and externally provided.)
+
+---
+
+## 3.6 Automation and Workflow Execution
+
+### FR-050: Execute Operations via CLI
+
+The system must expose all core functions via CLI commands.
+
+---
+
+### FR-051: Support Batch Processing
+
+The system must allow operations to be applied to multiple documents in a single execution.
+
+---
+
+### FR-052: Support Repeatable Workflows
+
+The system must allow the same operation to be executed repeatedly with consistent results for identical inputs.
+
+---
+
+## 3.7 Configuration Handling
+
+### FR-060: Load Configuration
+
+The system must load configuration from:
+
+* Files (e.g. project-level configuration)
+* Environment variables
+
+---
+
+### FR-061: Apply Configuration to Operations
+
+The system must apply configuration consistently across operations.
+
+---
+
+## 3.8 Caching and Incremental Processing
+
+### FR-070: Cache Processing Results
+
+The system must store intermediate or final results to avoid redundant computation.
+
+---
+
+### FR-071: Detect Changes
+
+The system must detect changes in input data and reprocess only affected parts.
+
+---
+
+## 3.9 Error Handling
+
+### FR-080: Provide Structured Errors
+
+The system must return structured error information for:
+
+* Invalid input
+* Schema violations
+* Execution failures
+
+---
+
+### FR-081: Avoid Silent Failures
+
+The system must not silently ignore errors that affect output correctness.
+
+---
+
+## 4. Functional Constraints
+
+* All functions must operate on **markdown-based input/output**
+* LLM-assisted functions must degrade gracefully when no LLM provider is available
+* Operations must not require persistent system infrastructure
+
+---
+
+## 5. Traceability
+
+| PRD Concept                      | FRS Coverage                 |
+| -------------------------------- | ---------------------------- |
+| Structured markdown manipulation | FR-001–FR-002, FR-020–FR-022 |
+| Schema validation                | FR-010–FR-012                |
+| Querying and extraction          | FR-030–FR-031                |
+| Templating and generation        | FR-040–FR-042                |
+| CLI and automation               | FR-050–FR-052                |
+| Configuration                    | FR-060–FR-061                |
+| Efficiency via caching           | FR-070–FR-071                |
+
+---
+
+## 6. Acceptance Perspective
+
+The system satisfies this FRS when:
+
+* Each function can be invoked via CLI or API
+* Outputs match defined input–output expectations
+* Validation, transformation, and generation behaviors are observable and verifiable
+* Errors are explicit and traceable
+
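To make FR-001, FR-002, FR-011, and FR-012 above more concrete, here is a minimal sketch of what "parse into a structured representation" and "report violations with location" could look like. It is not markitect-tool's implementation or API: `parse_markdown`, `validate`, `Section`, `Document`, `Violation`, and the rule names are all hypothetical, the frontmatter handling is deliberately naive (flat `key: value` pairs only), and only ATX-style `#` headings are recognised.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Optional

# All names below are hypothetical illustrations, not markitect-tool's API.
FENCE = "`" * 3  # marker that opens or closes a fenced code block
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Section:
    level: int   # heading depth: 1 for '#', 2 for '##', ...
    title: str
    line: int    # 1-based line number of the heading
    body: list[str] = field(default_factory=list)


@dataclass
class Document:
    frontmatter: dict[str, str]
    sections: list[Section]


@dataclass
class Violation:
    rule: str
    message: str
    line: Optional[int] = None  # None when no single line is responsible


def parse_markdown(text: str) -> Document:
    """Split markdown into flat frontmatter pairs and a list of sections."""
    lines = text.splitlines()
    frontmatter: dict[str, str] = {}
    start = 0
    if lines and lines[0].strip() == "---":
        # Only treat the block as frontmatter if a closing '---' exists.
        end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
        if end is not None:
            for raw in lines[1:end]:
                key, sep, value = raw.partition(":")
                if sep:
                    frontmatter[key.strip()] = value.strip()
            start = end + 1

    sections: list[Section] = []
    in_fence = False
    for number, raw in enumerate(lines[start:], start=start + 1):
        if raw.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines inside fenced code blocks are never treated as headings.
        match = None if in_fence else HEADING.match(raw)
        if match:
            sections.append(Section(len(match.group(1)), match.group(2), number))
        elif sections:
            sections[-1].body.append(raw)
    return Document(frontmatter, sections)


def validate(doc: Document, required_keys: list[str], required_sections: list[str]) -> list[Violation]:
    """Check a document against a tiny schema and return structured violations."""
    violations: list[Violation] = []
    for key in required_keys:
        if key not in doc.frontmatter:
            violations.append(Violation("frontmatter-required", f"missing frontmatter key '{key}'"))
    titles = {section.title for section in doc.sections}
    for title in required_sections:
        if title not in titles:
            violations.append(Violation("section-required", f"missing section '{title}'"))
    previous = 0
    for section in doc.sections:
        # A heading may go at most one level deeper than the one before it.
        if previous and section.level > previous + 1:
            violations.append(Violation(
                "heading-skip",
                f"heading '{section.title}' jumps from level {previous} to {section.level}",
                section.line,
            ))
        previous = section.level
    return violations
```

Returning a list of `Violation` records instead of raising on the first problem is one way to satisfy both FR-012 (location and context) and FR-080 (structured errors): a caller can render them for a human or serialise them for an automation system.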
--- a/wiki/ProductRequirementsDocument.md
+++ b/wiki/ProductRequirementsDocument.md
@@ -0,0 +1,225 @@
+# Markitect Tool Product Requirements Document V0.1
+
+## markitect-tool
+
+---
+
+## 1. Product Overview
+
+### 1.1 Product Name
+
+**markitect-tool** (`markitect`, CLI alias: `mkt`)
+
+### 1.2 Product Definition
+
+markitect-tool is a **markdown-native toolkit and CLI** that transforms semi-structured markdown into **structured, queryable, and reusable knowledge artifacts**.
+
+It provides a **contract layer between markdown and structured knowledge operations**, enabling deterministic and LLM-assisted workflows without binding consumers to specific tools, formats, or providers.
+
+---
+
+## 2. Product Intent
+
+### 2.1 Problem Statement
+
+Markdown is widely used but suffers from:
+
+* Lack of enforced structure and consistency
+* Limited automation and transformation capabilities
+* Weak support for large-scale knowledge management
+* Tight coupling to manual workflows or ad-hoc tooling
+
+This results in fragmented, hard-to-maintain knowledge systems.
+
+---
+
+### 2.2 Intended Outcomes
+
+markitect-tool enables:
+
+* Treating markdown as **structured data rather than plain text**
+* Reliable **validation, transformation, and composition** of documents
+* Efficient **automation and reuse of knowledge artifacts**
+* A stable interface for **human and AI interaction with knowledge**
+
+---
+
+### 2.3 Success Criteria
+
+The product is successful when:
+
+* Users can **operate on large markdown corpora** with consistency and automation
+* Applications can depend on **provider-neutral, stable primitives**
+* LLM agents can interact with markdown as a **predictable, structured protocol**
+* Higher-layer systems (e.g. kontextual-engine) can use markitect without workarounds
+
+---
+
+## 3. Scope Definition
+
+### 3.1 In Scope
+
+* Markdown parsing into structured representations
+* Schema definition, validation, and enforcement
+* Transformation and composition (including transclusion-like mechanisms)
+* Querying and extraction of structured content
+* Templating and generation (deterministic and optionally LLM-assisted)
+* CLI-based workflows (`mkt`)
+* Programmatic API for integration
+* Caching and incremental processing for efficiency
+
+---
+
+### 3.2 Out of Scope
+
+* Persistent knowledge system or enterprise content management (ECM) functionality
+* Multi-format content management beyond markdown-native handling
+* User management, permissions, or service orchestration
+* Domain-specific knowledge models or workflows
+* Full application frameworks or agent systems
+* Secret management or infrastructure concerns
+
+---
+
+### 3.3 Boundary Clarification
+
+markitect-tool provides **primitives**, not systems.
+
+* System-level orchestration → `kontextual-engine`
+* Project-level knowledge application → `infospace-bench`
+
+---
+
+## 4. Functional Expectations
+
+### 4.1 Core Capabilities
+
+The product must support:
+
+* **Parsing & Structuring**
+  Convert markdown into machine-interpretable representations
+
+* **Validation**
+  Enforce schemas and structural constraints
+
+* **Transformation**
+  Modify and compose documents programmatically
+
+* **Templating**
+  Generate markdown from structured input plus rules or templates
+
+* **Querying**
+  Extract information from documents and collections
+
+* **Automation Support**
+  Enable repeatable workflows via CLI and API
+
+---
+
+### 4.2 Interaction Modes
+
+* CLI-first interaction (`mkt`)
+* Library/API integration
+* Automation/agent execution
+
+---
+
+## 5. Non-Functional Expectations
+
+### 5.1 Performance
+
+* Efficient handling of large document sets (hundreds to thousands of documents)
+* Incremental updates and caching to avoid redundant work
+
+---
+
+### 5.2 Reliability
+
+* Deterministic behavior for core operations
+* Explicit failure modes for invalid input or schema violations
+
+---
+
+### 5.3 Extensibility
+
+* Plugin/adaptor model for extending functionality
+* Ability to integrate LLM capabilities without coupling core logic
+
+---
+
+### 5.4 Usability
+
+* Clear CLI interface with composable commands
+* Predictable behavior across workflows
+* Minimal required configuration for common use cases
+
+---
+
+## 6. Assumptions and Dependencies
+
+### 6.1 Assumptions
+
+* Markdown remains the primary human/agent interaction format
+* Users are comfortable with CLI-based workflows
+* Structured knowledge benefits from schema-driven approaches
+
+---
+
+### 6.2 Dependencies
+
+* LLM capabilities (optional, via `llm-connect`)
+* File system as primary storage medium
+* Downstream systems for persistence and orchestration
+
+---
+
+## 7. Constraints
+
+* Must remain **implementation-agnostic at the interface level**
+* Must not introduce coupling to specific providers or platforms
+* Must maintain **clear separation from higher-layer systems**
+* Must keep core primitives **small, stable, and composable**
+
+---
+
+## 8. Risks
+
+* Scope creep toward full platform/ECM functionality
+* Over-reliance on LLMs reducing determinism
+* Complexity growth reducing usability
+* Fragmentation if primitives are not well-defined
+
+---
+
+## 9. Related Systems
+
+* **kontextual-engine** – system layer (persistence, orchestration)
+* **infospace-bench** – application layer (knowledge projects)
+* **llm-connect** – provider abstraction for LLM integration
+
+---
+
+## 10. Ecosystem Context
+
+This product is part of a layered knowledge system:
+
+```text
+markitect-tool     → makes markdown structured and manipulable
+kontextual-engine  → makes knowledge persistent and operable
+infospace-bench    → makes knowledge concrete and meaningful
+```
+
+Layers:
+
+* **Syntax layer** → markitect-tool
+* **System layer** → kontextual-engine
+* **Application layer** → infospace-bench
+
+---
+
+## 11. PRD Type
+
+**Hybrid / Boundary PRD**
+
+This PRD defines stable product intent and boundaries while allowing flexibility in implementation and evolution.
+
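The PRD's performance expectation (section 5.1, "incremental updates and caching") and FR-070/FR-071 in the FRS leave the change-detection mechanism open. One common approach, sketched below under stated assumptions, is a content-hash manifest: hash every markdown file, compare against the hashes recorded on the previous run, and reprocess only the files whose hash differs. The function names, the JSON manifest format, and the choice of SHA-256 are all assumptions made for illustration; neither document prescribes them.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path

# Hypothetical helper names; the specs do not define a cache layout.


def content_digest(path: Path) -> str:
    """Hash file contents so that a bare timestamp change does not count as a change."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def scan(root: Path) -> dict[str, str]:
    """Map each markdown file under root (as a relative POSIX path) to its digest."""
    return {
        path.relative_to(root).as_posix(): content_digest(path)
        for path in sorted(root.rglob("*.md"))
    }


def load_manifest(manifest: Path) -> dict[str, str]:
    """Read the digests recorded by the previous run, or an empty map on first run."""
    if not manifest.exists():
        return {}
    return json.loads(manifest.read_text(encoding="utf-8"))


def changed_documents(current: dict[str, str], previous: dict[str, str]) -> list[str]:
    """Return files that are new or whose content differs from the last run."""
    return [name for name, digest in current.items() if previous.get(name) != digest]


def removed_documents(current: dict[str, str], previous: dict[str, str]) -> list[str]:
    """Return files recorded last run that no longer exist."""
    return sorted(set(previous) - set(current))


def save_manifest(manifest: Path, current: dict[str, str]) -> None:
    """Persist the digests; call this only after processing has succeeded."""
    manifest.write_text(json.dumps(current, indent=2, sort_keys=True), encoding="utf-8")
```

Keeping `save_manifest` separate from the scan means a failed run does not mark documents as processed, which fits the "avoid silent failures" requirement (FR-081). Because the manifest is a plain file, the sketch also stays within the constraint that operations must not require persistent system infrastructure.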