Major overhaul of requirements for refined INTENT.md

2026-05-05 18:04:51 +02:00
parent e201d37f96
commit 3264e05c0a
10 changed files with 2659 additions and 354 deletions

238
INTENT.md

@@ -2,111 +2,237 @@
## Purpose
This repository exists to provide an **AI-first, headless knowledge and content engine** for managing, transforming, and operating structured information across heterogeneous data sources.
`kontextual-engine` exists to provide a **headless knowledge operations engine** for turning heterogeneous information assets into persistent, contextual, governed, retrievable, transformable, and agent-operable knowledge.
It enables persistent, service-based knowledge systems that support **efficient research, composition, and reuse of information**.
The project addresses the utility demand behind systems such as content management, document management, enterprise content management, file services, knowledge bases, research repositories, and AI-assisted knowledge workflows. It is not limited to any one of those categories. Its role is to provide reusable backend capabilities for making fragmented information operational.
`kontextual-engine` should help people, teams, applications, automation systems, and AI agents work with knowledge assets across different sources, formats, domains, and lifecycle states.
---
## Utility Demand
Organizations and individuals accumulate valuable information in fragmented forms:
* files and folders
* markdown and text repositories
* office documents
* PDFs
* datasets
* notes
* records
* policies
* project documentation
* knowledge-base articles
* generated AI outputs
* operational documents
* content archives
* application-linked documents and records
These assets often remain economically underused because they are disconnected, inconsistently structured, weakly contextualized, difficult to govern, hard to retrieve, and unsafe to automate without explicit controls.
`kontextual-engine` exists to solve this problem by giving knowledge assets durable identity, contextual structure, governed access, retrievable meaning, traceable transformation, and automation-ready interfaces.
It is not merely a storage layer. It is an engine for making knowledge operational.
---
## Primary Utility
The repository provides a **runtime system and service layer** that:
The repository provides a **runtime and service layer for knowledge operations**.
* Manages knowledge as persistent, structured collections across projects and domains
* Integrates and normalizes data from multiple formats (markdown, documents, datasets, files)
* Orchestrates transformation workflows, including templating, generation, and analysis
* Provides APIs and service endpoints for accessing and operating on knowledge
* Supports AI-driven interaction, automation, and augmentation of knowledge processes
It is intended to support:
It transforms static content into a **living, operable knowledge system**.
* ingestion of knowledge assets from multiple sources and formats
* persistent representation of assets with stable identity
* extraction and normalization of useful structure, metadata, and content
* contextualization through metadata, relationships, provenance, classification, and lifecycle state
* retrieval through search, filtering, querying, browsing, APIs, and agent-compatible access patterns
* transformation of content into summaries, extracts, structured representations, generated artifacts, reports, views, or downstream formats
* workflow orchestration for recurring knowledge operations such as ingestion, enrichment, validation, review, publication, archival, and synchronization
* governed access through permissions, auditability, traceability, review state, and operational controls
* AI-assisted and agent-safe operation through explicit, permissioned, and auditable interfaces
The core value of `kontextual-engine` is to make knowledge **durable, addressable, contextual, searchable, transformable, governable, and operationally useful**.
---
## Intended Users
* Developers building knowledge-driven applications and services
* Infrastructure operators (`adm`) managing knowledge systems and deployments
* Automation systems (`atm`) orchestrating workflows and transformations
* LLM agents (`agt`) interacting with and evolving structured knowledge environments
`kontextual-engine` is intended for:
* developers building knowledge-driven applications and services
* teams that need structured access to documents, content, files, records, and datasets
* operators managing durable knowledge services
* product builders creating CMS, DMS, ECM, knowledge-base, research-support, file-service-like, or AI-assistant-backed systems
* automation systems that need reliable access to contextual information
* AI agents that need to inspect, retrieve, transform, enrich, and maintain knowledge assets
* researchers, analysts, and knowledge workers managing evolving collections of information
The system should be usable by humans through applications and by machines through APIs, workflows, and controlled agent interfaces.
---
## Strategic Role in the System
This repository is part of a layered knowledge system with clearly separated responsibilities:
## Strategic Role
- markitect-tool → makes markdown structured and manipulable
- **kontextual-engine** → makes knowledge persistent and operable
- infospace-bench → makes knowledge concrete and meaningful
`kontextual-engine` serves as a **knowledge operations engine**.
These layers correspond to a deliberate separation of concerns:
Its role is to provide reusable backend capabilities for managing knowledge as an active operational resource rather than as passive content.
* **Syntax layer** — structuring and transforming semi-structured data (markdown)
* **System layer** — operating, persisting, and orchestrating knowledge
* **Application layer** — applying knowledge systems to real-world contexts
This includes:
This repository occupies the **system layer** and should maintain **clear boundaries** with the other two layers.
* asset identity
* persistence
* ingestion
* normalization
* metadata
* contextual relationships
* indexing and retrieval
* transformation
* workflow execution
* permissions and access control
* provenance and traceability
* governance hooks
* integration interfaces
* agent-oriented operation
This repository acts as the **headless knowledge engine layer**:
The project should remain focused on the engine layer: the durable runtime capabilities needed to operate knowledge systems across many domains, applications, and deployment models.
* It sits above tool-level primitives (e.g. `markitect-tool`)
* It provides **persistence, orchestration, and access** to knowledge systems
* It enables **AI-native workflows** over structured and semi-structured data
* It supports multiple interaction modes: API, service, and agent-driven
It should not be constrained to a single content format, user interface, application domain, storage backend, AI model, or deployment scenario.
It is the **runtime substrate for knowledge systems**, not the tooling layer.
---
## Core Capabilities
A mature `kontextual-engine` should provide capabilities in the following areas.
### Knowledge Asset Management
The system should manage knowledge assets as persistent entities with stable identity, metadata, relationships, provenance, versions, permissions, and lifecycle state.
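As an illustration only, the asset model described above could be sketched as a small record type. Every name here (`KnowledgeAsset`, `LifecycleState`, `revise`, and the specific lifecycle states) is a hypothetical choice for this sketch, not something defined by the project.

```python
from dataclasses import dataclass, field
from enum import Enum
import uuid


class LifecycleState(Enum):
    # Hypothetical states; the intent document does not enumerate them.
    DRAFT = "draft"
    REVIEWED = "reviewed"
    PUBLISHED = "published"
    ARCHIVED = "archived"


@dataclass
class KnowledgeAsset:
    source_uri: str
    media_type: str
    # Stable identity: generated once and carried across versions.
    asset_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    version: int = 1
    state: LifecycleState = LifecycleState.DRAFT
    metadata: dict[str, str] = field(default_factory=dict)
    # Each relationship is a (relation name, target asset_id) pair.
    relationships: list[tuple[str, str]] = field(default_factory=list)
    # Human-readable notes describing how this asset came to be.
    provenance: list[str] = field(default_factory=list)

    def revise(self, note: str, **metadata: str) -> "KnowledgeAsset":
        """Return a new version that keeps the same identity."""
        return KnowledgeAsset(
            source_uri=self.source_uri,
            media_type=self.media_type,
            asset_id=self.asset_id,
            version=self.version + 1,
            state=self.state,
            metadata={**self.metadata, **metadata},
            relationships=list(self.relationships),
            provenance=[*self.provenance, note],
        )
```

Returning a new object from `revise` rather than mutating in place is one way to keep earlier versions inspectable; a real implementation might instead persist versions in a store.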
### Multi-Format Ingestion
The system should ingest and normalize information from heterogeneous sources and formats, including text files, markdown, office documents, PDFs, datasets, structured records, generated outputs, and other content sources.
### Contextualization
The system should enrich knowledge assets with context such as tags, classifications, links, references, provenance, ownership, source information, temporal information, semantic annotations, review state, and derived relationships.
### Retrieval and Access
The system should expose knowledge through search, filtering, querying, browsing, APIs, and agent-compatible access patterns while respecting permissions and operational constraints.
### Transformation
The system should support controlled transformation of knowledge assets into summaries, extracts, structured representations, generated artifacts, reports, views, and downstream formats.
Transformations should be traceable to their inputs, configuration, actor, workflow, and output artifacts.
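A minimal sketch of what such traceability might look like, assuming a transformation is a plain function over text. `TransformationRecord` and `run_traced` are hypothetical names introduced for illustration, and the hash-derived output identifier is an assumption of this sketch, not a project decision.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable
import hashlib
import json


@dataclass(frozen=True)
class TransformationRecord:
    input_ids: tuple[str, ...]
    output_id: str
    config: dict
    actor: str
    workflow: str
    created_at: str


def run_traced(
    transform: Callable[[str, dict], str],
    input_id: str,
    content: str,
    config: dict,
    actor: str,
    workflow: str,
) -> tuple[str, TransformationRecord]:
    """Apply a transformation and record what produced the output."""
    output = transform(content, config)
    # Derive the output identifier from input, config, and result so that
    # repeating the same run yields the same identifier.
    # Assumes config is JSON-serializable.
    payload = json.dumps([input_id, config, output], sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    record = TransformationRecord(
        input_ids=(input_id,),
        output_id="derived-" + digest[:16],
        config=dict(config),
        actor=actor,
        workflow=workflow,
        created_at=datetime.now(timezone.utc).isoformat(),
    )
    return output, record
```

For example, calling `run_traced` with a truncating function and `{"max_chars": 5}` returns both the shortened text and a record naming the source asset, the configuration, the actor, and the workflow.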
### Workflow Operation
The system should support repeatable knowledge workflows such as ingestion, classification, validation, enrichment, review, approval, publication, archival, synchronization, and exception handling.
### Governance and Traceability
The system should preserve enough operational history to understand where knowledge came from, how it changed, who or what acted on it, which permissions applied, and what downstream artifacts depend on it.
### AI-Assisted and Agent-Safe Operation
The system should be designed so that AI agents can safely inspect, retrieve, transform, classify, enrich, and maintain knowledge assets through explicit interfaces and controlled workflows.
Agent operation should be permissioned, auditable, reviewable, and reversible where practical.
---
## Strategic Boundaries
This repository is **not** intended to:
This repository is **not** intended to be:
* Replace low-level tooling for markdown or structured content manipulation
* Be constrained to markdown as a primary format
* Define end-user projects, experiments, or domain-specific knowledge spaces
* Act as a simple CLI toolkit
* a single-purpose document editor
* a simple file browser
* a format-specific markdown tool
* a pure vector database
* a generic chatbot over documents
* a finished end-user CMS by itself
* a visual website builder
* a file-sync client
* a domain-specific knowledge base
* a one-off automation script collection
* a full replacement for specialized authoring, publishing, legal, records-management, or analytical tools
Such concerns belong to:
Instead, it should provide reusable backend capabilities that such systems may depend on.
* `markitect-tool` (tooling layer)
* `infospace-bench` (project/workspace layer)
It may support user interfaces, command-line tools, importers, exporters, connectors, dashboards, and domain-specific applications, but those should remain consumers or extensions of the engine rather than the core identity of the project.
---
## Design Principles
* **AI-first operation**
The system is designed for interaction and orchestration by LLM agents
### Utility before presentation
* **Format-agnostic knowledge handling**
All data types are supported; markdown may serve as an interaction layer, not a constraint
The engine should focus first on making knowledge operationally useful. User interfaces and presentation layers may be built on top, but they should not define the core architecture.
* **Separation of concerns**
Tooling, runtime, and project layers are explicitly decoupled
### Format agnosticism
* **Persistent knowledge state**
Knowledge is stored, versioned, and evolved over time
The system should support many content types and should not be constrained by one preferred authoring format.
* **Operational composability**
Workflows are built from reusable, orchestratable primitives
### Persistent knowledge state
Knowledge assets should have durable identity, lifecycle state, metadata, relationships, provenance, permissions, and operational history.
### Context as a first-class concern
The system should treat relationships, provenance, classification, lifecycle state, and usage context as core information, not as secondary decoration.
### Traceable transformation
Generated summaries, derived artifacts, classifications, extractions, and other transformations should remain linked to their source assets and workflow context.
### API-first and automation-ready
The system should expose stable interfaces suitable for applications, services, scripts, workflows, and AI agents.
### Agent-safe operation
AI agents should operate through explicit, permissioned, auditable, and bounded interfaces. Risky operations should support review gates, dry runs, or reversible workflows where appropriate.
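One way this principle could be expressed, sketched under the assumption of a single in-process gateway: `AgentGateway`, its fields, and the outcome strings are all hypothetical and only illustrate permission checks, dry runs, review gates, and audit logging working together.

```python
from dataclasses import dataclass, field


@dataclass
class AgentGateway:
    # Maps each actor to the set of operations it may request.
    permissions: dict[str, set[str]]
    # Operations that must pass human review before being applied.
    needs_review: set[str] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)
    review_queue: list[dict] = field(default_factory=list)

    def request(self, actor: str, operation: str, asset_id: str,
                dry_run: bool = True) -> str:
        """Decide what happens to a requested operation and audit it."""
        entry = {"actor": actor, "operation": operation, "asset_id": asset_id}
        if operation not in self.permissions.get(actor, set()):
            outcome = "denied"
        elif dry_run:
            # Default: report that the call would be allowed, change nothing.
            outcome = "dry_run"
        elif operation in self.needs_review:
            self.review_queue.append(entry)
            outcome = "pending_review"
        else:
            outcome = "applied"
        # Every request is logged, including denied and dry-run ones.
        self.audit_log.append({**entry, "outcome": outcome})
        return outcome
```

Defaulting `dry_run` to `True` means an agent has to opt in explicitly to a real change, which is one possible reading of "bounded interfaces".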
### Composable operation
Knowledge operations should be built from reusable capabilities that can be combined into workflows.
### Human and agent collaboration
The system should support both human-directed and AI-assisted knowledge work, with clear ownership, permissions, review mechanisms, and traceability.
### Separation of engine and application
The repository should provide reusable engine capabilities rather than hard-coding one specific application, domain, user experience, storage backend, or AI model.
---
## Maturity Target
A mature version of this repository should:
A mature version of `kontextual-engine` should act as a robust, scalable backend for governed, AI-assisted knowledge management.
* Provide a **robust, scalable runtime for knowledge systems**
* Support **multi-format ingestion, transformation, and retrieval**
* Enable **fully automated and agent-driven knowledge workflows**
* Expose stable APIs for integration with external systems
* Act as the **default engine for AI-driven knowledge management**
It should be able to:
* ingest and manage heterogeneous knowledge assets
* maintain persistent and traceable knowledge state
* represent context through metadata, relationships, provenance, and lifecycle state
* expose reliable APIs for applications, automation systems, and AI agents
* support search, retrieval, transformation, and workflow execution
* enforce permissions, auditability, review, and governance controls
* integrate with external storage, document, content, data, and search systems
* enable AI agents to operate knowledge safely and effectively
* support CMS, DMS, ECM, file-service, knowledge-base, research-support, and AI-assistant use cases
* serve as a reusable foundation for knowledge-driven products and platforms
The long-term goal is to make `kontextual-engine` a default backend engine for systems that need to turn fragmented information into structured, contextual, governed, and operational knowledge.
---
## Stability Note
Changes to this file represent a **deliberate shift in the system's role as a knowledge engine and runtime layer**.
Such changes should be made with explicit architectural intent, as they affect all dependent systems and projects.
Changes to this file should represent deliberate changes to the intended role of the repository.
Because this document defines the project's durable purpose, it should remain more stable than implementation details, feature plans, vendor comparisons, deployment-specific architecture decisions, or temporary implementation constraints.