tegwick 3264e05c0a Major overhaul of requirements for refined INTENT.md

2026-05-05 18:04:51 +02:00

kontextual-engine — Project Scope Suggestions

Research date: 2026-05-05
Purpose: convert market exploration into concrete scope guidance for the project and its INTENT.md.

Recommended project definition

kontextual-engine should be defined as:

A headless knowledge operations engine for turning heterogeneous information assets into persistent, contextual, governed, retrievable, transformable, and agent-operable knowledge.

This definition is broad enough to support CMS, DMS, ECM, file-service, knowledge-base, research, and AI-assistant use cases, but narrow enough to avoid becoming an unfocused clone of mature enterprise suites.

Recommended utility-demand framing

The project should start from the customer problem:

Corporate customers accumulate valuable information across files, folders, documents, records, datasets, applications, generated AI outputs, and knowledge bases. This information is economically underused because it is fragmented, inconsistently structured, weakly contextualized, hard to govern, difficult to retrieve, and unsafe to automate without explicit controls.

kontextual-engine addresses this demand by giving knowledge assets:

stable identity
metadata and context
relationships
provenance
lifecycle state
permissions and governance
search and retrieval
transformation workflows
API access
agent-safe operation

Strategic scope

In scope

kontextual-engine should provide reusable backend capabilities for:

ingesting heterogeneous information assets
representing assets as persistent entities
normalizing and extracting useful structure
assigning metadata, relationships, provenance, and lifecycle state
retrieving assets through keyword, filtered, semantic, and contextual search
transforming content into summaries, extracts, structured views, reports, and generated artifacts
orchestrating recurring knowledge workflows
exposing APIs for applications, automation systems, and AI agents
enforcing permissions, traceability, review, and governance controls

Out of scope as core identity

kontextual-engine should not define itself as:

a finished end-user CMS
a website builder
a generic office suite
a sync-and-share client
a simple file browser
a markdown-only tool
a pure vector database
a generic chatbot over documents
a single-domain knowledge base
a one-off automation script collection
a full replacement for mature ECM/DMS/records systems in its first maturity phases

These capabilities can be supported at the edges, but they should not define the engine.

Recommended differentiation

1. Context-first knowledge identity

Competitors often anchor identity in repositories, paths, records, pages, documents, or content models. kontextual-engine can differentiate by anchoring identity in what an asset means and how it is used, independent of where it is stored.

Recommended design focus:

stable asset IDs
source IDs and source aliases
semantic type
business context
relationship graph
provenance chain
lifecycle state
derived artifact lineage
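
The design focus above could be modeled as a single asset record. The following is a minimal sketch under stated assumptions, not the project's schema: the names `KnowledgeAsset`, `Relationship`, `LifecycleState`, the lifecycle values, and every field name are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from enum import Enum


class LifecycleState(Enum):
    # Illustrative states only; the scope document does not enumerate them.
    DRAFT = "draft"
    ACTIVE = "active"
    ARCHIVED = "archived"


@dataclass(frozen=True)
class Relationship:
    kind: str        # e.g. "derived_from", "references"
    target_id: str   # asset ID of the related asset


@dataclass
class KnowledgeAsset:
    source_id: str                      # identifier in the originating system
    semantic_type: str                  # e.g. "report", "contract"
    asset_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    source_aliases: list[str] = field(default_factory=list)
    business_context: dict[str, str] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)
    provenance: list[str] = field(default_factory=list)  # ordered chain of events
    lifecycle: LifecycleState = LifecycleState.DRAFT

    def relocate(self, new_source_id: str) -> None:
        """Record a move or rename without changing the asset's identity."""
        self.source_aliases.append(self.source_id)
        self.provenance.append(f"relocated:{self.source_id}->{new_source_id}")
        self.source_id = new_source_id

    def derived_from(self) -> list[str]:
        """Return IDs of the assets this one was derived from (its lineage)."""
        return [r.target_id for r in self.relationships if r.kind == "derived_from"]


report = KnowledgeAsset(source_id="file:///projects/alpha/report.pdf", semantic_type="report")
original_id = report.asset_id
report.relocate("file:///archive/alpha/report.pdf")
```

The point of the sketch is that `asset_id` survives the relocation while the old path is kept as an alias, so identity is not tied to a repository path.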

2. Traceable transformations

Many systems generate summaries or extract fields, but the strategic value lies in knowing where derived knowledge came from and how it was produced.

Recommended design focus:

transformations as first-class operations
explicit input/output asset links
versioned prompts/configuration where applicable
transformation metadata
review status
reproducibility hooks
rollback or supersession semantics
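
One way to picture transformations as first-class operations is a record that links inputs to outputs and carries a version of its configuration. This is a sketch only: `TransformationRecord`, `TransformationLog`, the review status strings, and the choice of a SHA-256 content hash as the configuration version are assumptions, not decisions taken from this document.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransformationRecord:
    operation: str                       # e.g. "summarize"
    input_asset_ids: tuple[str, ...]
    output_asset_id: str
    config: dict                         # prompt text, parameters, ...
    review_status: str = "pending"       # hypothetical: "pending" | "approved" | "rejected"
    superseded_by: Optional[str] = None

    @property
    def config_version(self) -> str:
        """Content hash of the configuration, usable as a version label."""
        canonical = json.dumps(self.config, sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()[:12]


class TransformationLog:
    def __init__(self) -> None:
        self._by_output: dict[str, TransformationRecord] = {}

    def record(self, rec: TransformationRecord) -> None:
        self._by_output[rec.output_asset_id] = rec

    def get(self, output_asset_id: str) -> TransformationRecord:
        return self._by_output[output_asset_id]

    def supersede(self, old_output_id: str, new_rec: TransformationRecord) -> None:
        """Keep the old record for traceability, but mark it as replaced."""
        self._by_output[old_output_id].superseded_by = new_rec.output_asset_id
        self.record(new_rec)

    def lineage(self, asset_id: str) -> set[str]:
        """Return all upstream asset IDs the given asset was derived from."""
        seen: set[str] = set()
        stack = [asset_id]
        while stack:
            rec = self._by_output.get(stack.pop())
            if rec is None:
                continue  # a source asset: no transformation produced it
            for parent in rec.input_asset_ids:
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


log = TransformationLog()
log.record(TransformationRecord("extract", ("doc-1",), "extract-1", {"fields": ["date"]}))
log.record(TransformationRecord("summarize", ("extract-1", "doc-2"), "summary-1", {"prompt": "v1"}))
log.supersede(
    "summary-1",
    TransformationRecord("summarize", ("extract-1", "doc-2"), "summary-2", {"prompt": "v2"}),
)
```

Here `log.lineage("summary-2")` walks back through `extract-1` to `doc-1`, and the superseded summary stays in the log instead of being deleted.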

3. Agent-safe operation

Agents should not be treated merely as chat UIs. Agents need permissioned, explicit, auditable operations.

Recommended design focus:

scoped tool/API permissions
actor identity for human, service, and agent actors
precondition checks
dry-run support
review gates for risky actions
audit logs
reversible changes where possible
policy violation detection
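
The controls listed above can be sketched as a single guarded operation. Everything named here is hypothetical: `Actor`, `GuardedStore`, the `asset:archive` scope string, the status values, and the rule that agents always go through review are illustrative assumptions, not a specified API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Actor:
    actor_id: str
    kind: str                      # "human" | "service" | "agent"
    scopes: frozenset[str]


@dataclass
class OperationResult:
    status: str                    # "applied" | "dry_run" | "denied" | "needs_review"
    detail: str = ""


@dataclass
class GuardedStore:
    assets: dict[str, dict] = field(default_factory=dict)
    audit_log: list[dict] = field(default_factory=list)

    def _audit(self, actor: Actor, op: str, asset_id: str, status: str) -> None:
        self.audit_log.append(
            {"actor": actor.actor_id, "kind": actor.kind, "op": op,
             "asset": asset_id, "status": status}
        )

    def archive(self, actor: Actor, asset_id: str, dry_run: bool = False) -> OperationResult:
        op = "asset:archive"
        if op not in actor.scopes:
            result = OperationResult("denied", "missing scope")
        elif asset_id not in self.assets:
            result = OperationResult("denied", "precondition failed: unknown asset")
        elif dry_run:
            result = OperationResult("dry_run", "would archive")
        elif actor.kind == "agent":
            # Assumed policy: risky actions by agents wait for human review.
            result = OperationResult("needs_review", "queued for human approval")
        else:
            self.assets[asset_id]["state"] = "archived"
            result = OperationResult("applied")
        # Every attempt is logged, including denied and dry-run calls.
        self._audit(actor, op, asset_id, result.status)
        return result


store = GuardedStore(assets={"a1": {"state": "active"}})
agent = Actor("agent-7", "agent", frozenset({"asset:archive"}))
reader = Actor("svc-1", "service", frozenset({"asset:read"}))
admin = Actor("user-1", "human", frozenset({"asset:archive"}))
```

With this setup, the reader is denied, the agent can dry-run but lands in review on a real call, and only the human actor changes state.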

4. Composable utility layer

CMS, DMS, ECM, file-service, knowledge-base, research, and AI-assistant capabilities should be framed as utilities built on the engine.

Recommended design focus:

APIs before UI
workflows before monolithic apps
exportability
integration adapters
schema extensibility
domain-specific extensions

5. Governance without becoming bureaucratic

Governance should be a capability, not a drag on utility.

Recommended design focus:

lightweight but explicit permissions
lifecycle state
review state
retention and archival hooks
audit log by default
policy-aware retrieval and transformation
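
Policy-aware retrieval can stay lightweight if the policy check is part of the search path itself. The sketch below assumes a naive substring match and hypothetical names (`IndexedAsset`, `policy_aware_search`, role and state strings); a real engine would sit on top of a proper search index.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedAsset:
    asset_id: str
    text: str
    lifecycle: str                 # hypothetical values: "active", "archived"
    review: str                    # hypothetical values: "approved", "pending"
    allowed_roles: frozenset[str]


def policy_aware_search(
    index: list[IndexedAsset],
    query: str,
    roles: set[str],
    audit_log: list[dict],
) -> list[str]:
    """Naive keyword search that applies policy before returning anything."""
    hits = []
    for asset in index:
        if query.lower() not in asset.text.lower():
            continue
        visible = (
            bool(roles & asset.allowed_roles)
            and asset.lifecycle == "active"
            and asset.review == "approved"
        )
        if visible:
            hits.append(asset.asset_id)
    # Audit by default: every query is logged, including empty results.
    audit_log.append({"query": query, "roles": sorted(roles), "returned": list(hits)})
    return hits


index = [
    IndexedAsset("a1", "Budget plan 2026", "active", "approved", frozenset({"finance"})),
    IndexedAsset("a2", "Budget draft", "active", "pending", frozenset({"finance"})),
    IndexedAsset("a3", "Old budget", "archived", "approved", frozenset({"finance"})),
    IndexedAsset("a4", "Budget FAQ", "active", "approved", frozenset({"staff", "finance"})),
]
audit: list[dict] = []
finance_hits = policy_aware_search(index, "budget", {"finance"}, audit)
staff_hits = policy_aware_search(index, "budget", {"staff"}, audit)
```

Unreviewed and archived assets never reach the caller, and the audit entry is written without any extra step on the caller's side.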

Suggested architecture-level scope boundaries

Layer	Should kontextual-engine own it?	Notes
Asset registry	Yes	Stable identity and core metadata should be central.
Source connectors	Yes, selectively	Build common connectors and allow extension. Do not try to support every enterprise app initially.
Storage abstraction	Yes	Assets may live in external systems, but the engine needs a durable representation.
Extraction / normalization	Yes	Required for search, metadata, AI, and transformations.
Search index	Yes or integrated	The engine must provide retrieval; it may use external search/vector systems internally.
Relationship graph	Yes	Core differentiator.
Workflow engine	Yes, initially simple	Needed for recurring knowledge operations and traceable transformations.
Permissions model	Yes	Must exist from the beginning even if initially simple.
Audit/provenance	Yes	Core trust capability.
End-user workspace UI	No, optional consumer	Useful later, but not the engine’s identity.
Visual website CMS	No, optional extension	Publishing can be supported through APIs.
File sync client	No	Avoid competing directly with Box, Dropbox, OneDrive, Egnyte.
Full records-management suite	Not initially	Support hooks and lifecycle state; specialized compliance can mature later.
General vector database	No	Use or integrate with search/vector systems; do not define the project as one.
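
The "Yes or integrated" boundary for the search index can be illustrated with a small interface: the engine owns retrieval as a capability while the index itself is pluggable. This is a sketch under assumptions; `SearchBackend`, `InMemoryKeywordBackend`, and `Engine` are hypothetical names, and the in-memory backend merely stands in for an external search or vector system.

```python
from __future__ import annotations

from typing import Protocol


class SearchBackend(Protocol):
    """What the engine needs from any search or vector system."""

    def index(self, asset_id: str, text: str) -> None: ...

    def query(self, terms: str) -> list[str]: ...


class InMemoryKeywordBackend:
    """Trivial stand-in for an external search or vector system."""

    def __init__(self) -> None:
        self._docs: dict[str, str] = {}

    def index(self, asset_id: str, text: str) -> None:
        self._docs[asset_id] = text.lower()

    def query(self, terms: str) -> list[str]:
        words = terms.lower().split()
        return sorted(a for a, t in self._docs.items() if all(w in t for w in words))


class Engine:
    """Owns retrieval as a capability while delegating the index itself."""

    def __init__(self, backend: SearchBackend) -> None:
        self._backend = backend

    def ingest(self, asset_id: str, text: str) -> None:
        self._backend.index(asset_id, text)

    def search(self, terms: str) -> list[str]:
        return self._backend.query(terms)


engine = Engine(InMemoryKeywordBackend())
engine.ingest("a1", "Quarterly revenue report")
engine.ingest("a2", "Revenue forecast notes")
results = engine.search("revenue report")
```

Swapping the backend for an adapter around an external system would not change `Engine`, which is the practical meaning of not defining the project as a vector database.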

Recommended first implementation wedge

The first strong wedge should be:

Ingest a heterogeneous project or organizational knowledge corpus, assign stable asset identities, extract metadata and structure, build contextual relationships, support governed retrieval, and produce traceable derived artifacts through API-accessible workflows.

This wedge demonstrates the project’s essence without requiring a full enterprise suite.
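
The first step of the wedge, ingesting a folder and assigning stable identities, might look like the sketch below. It assumes one particular strategy, matching files by SHA-256 content hash so that a renamed file keeps its ID; `ingest_folder` and the registry layout are hypothetical. The strategy has known limits: identical files collapse into one asset, and an edited file is treated as new content.

```python
from __future__ import annotations

import hashlib
import tempfile
import uuid
from pathlib import Path


def ingest_folder(root: Path, registry: dict[str, dict]) -> dict[str, dict]:
    """Register every file under `root`, reusing IDs for known content.

    `registry` maps asset_id -> {"path", "sha256", "suffix"} and is updated in place.
    """
    by_hash = {entry["sha256"]: asset_id for asset_id, entry in registry.items()}
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        rel = path.relative_to(root).as_posix()
        asset_id = by_hash.get(digest)
        if asset_id is None:
            asset_id = str(uuid.uuid4())   # new content: mint a new identity
            by_hash[digest] = asset_id
        # Known content keeps its ID even if the file was moved or renamed.
        registry[asset_id] = {"path": rel, "sha256": digest, "suffix": path.suffix.lower()}
    return registry


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.md").write_text("# Kickoff\n", encoding="utf-8")
    registry = ingest_folder(root, {})
    (first_id,) = registry                       # exactly one asset so far
    (root / "notes.md").rename(root / "kickoff.md")
    registry = ingest_folder(root, registry)     # same content, new path
    same_identity = list(registry) == [first_id]
```

After the rename, the registry still holds a single asset with the original ID and the updated path.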

MVP capability package

Asset registry
Source ingestion for local files, markdown, PDFs, and office-like documents
Metadata extraction and manual metadata override
Stable source/provenance tracking
Search and filtered retrieval
Relationship model
Traceable transformation jobs
API access
Basic permission model
Audit log
Agent-safe operation endpoints

MVP demonstration scenarios

“Turn a project folder into a contextual knowledge space.”
“Find and cite relevant knowledge assets across mixed formats.”
“Generate a traceable summary or report from selected sources.”
“Classify and enrich assets through a reviewable workflow.”
“Expose project knowledge to an agent through controlled APIs.”

Recommended language for INTENT.md

Use language like:

“headless knowledge operations engine”
“heterogeneous information assets”
“persistent identity”
“contextual structure”
“governed access”
“retrievable meaning”
“traceable transformation”
“workflow-ready and agent-operable interfaces”

Avoid language like:

“runtime substrate” unless clarified for external readers
“system layer” without a self-contained explanation
references to other internal projects
“not the tooling layer” unless the tooling is explained generically
“AI-first” without grounding it in concrete utility

Recommended final positioning statement

kontextual-engine exists to operate knowledge assets across heterogeneous sources by giving them durable identity, contextual structure, governed access, retrievable meaning, traceable transformation, and automation-ready interfaces.

Expanded version:

It supports the utility demand behind CMS, DMS, ECM, file-service, knowledge-base, research, and AI-assistant systems without becoming any one of those products. Its core role is to provide reusable backend capabilities for making fragmented information operational.

Risks to avoid

Risk 1: Becoming too broad

Trying to be a CMS, DMS, ECM, file server, RAG system, intranet, and workflow suite at the same time will dilute implementation quality.

Mitigation:

Frame these as utility domains supported by a shared engine.
Prioritize identity, context, retrieval, transformations, workflows, and governance.

Risk 2: Becoming “chat over files”

Many AI knowledge products reduce to a chatbot over indexed documents.

Mitigation:

Make traceability, lifecycle state, transformations, review, and workflows core.

Risk 3: Ignoring permissions until later

Retrofitting permissions onto an existing data model is difficult and dangerous.

Mitigation:

Model actors, roles, permissions, and audit from the beginning.

Risk 4: Overfitting to one content format

The project should handle markdown well where that is useful, but market demand spans heterogeneous formats.

Mitigation:

Treat markdown, PDFs, documents, datasets, and records as asset types, not the system identity.
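
Treating formats as asset types can be sketched as a registry of per-type extractors behind one entry point. The names `extractor`, `extract`, and the type labels `markdown` and `csv-dataset` are hypothetical, and the two extractors are deliberately trivial.

```python
from __future__ import annotations

import csv
import io
from typing import Callable

Extractor = Callable[[bytes], dict]
_EXTRACTORS: dict[str, Extractor] = {}


def extractor(asset_type: str) -> Callable[[Extractor], Extractor]:
    """Decorator that registers a function as the extractor for one asset type."""
    def register(fn: Extractor) -> Extractor:
        _EXTRACTORS[asset_type] = fn
        return fn
    return register


@extractor("markdown")
def extract_markdown(data: bytes) -> dict:
    lines = data.decode("utf-8").splitlines()
    headings = [ln.lstrip("#").strip() for ln in lines if ln.startswith("#")]
    return {"headings": headings}


@extractor("csv-dataset")
def extract_csv(data: bytes) -> dict:
    rows = list(csv.reader(io.StringIO(data.decode("utf-8"))))
    return {"columns": rows[0] if rows else [], "row_count": max(len(rows) - 1, 0)}


def extract(asset_type: str, data: bytes) -> dict:
    """Dispatch on asset type; unknown types still get a minimal record."""
    fn = _EXTRACTORS.get(asset_type)
    structure = fn(data) if fn else {}
    return {"asset_type": asset_type, "size_bytes": len(data), "structure": structure}


md = extract("markdown", b"# Title\ntext\n## Sub\n")
table = extract("csv-dataset", b"a,b\n1,2\n3,4\n")
unknown = extract("pdf", b"%PDF")
```

Markdown is just one registered type here; adding a PDF or records extractor would not touch `extract` or the asset model.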

Risk 5: No clear buyer/use-case anchor

A general knowledge engine can sound abstract.

Mitigation:

Anchor early demos in concrete use cases: AI-ready project corpus, document workflow automation, governed retrieval, traceable report generation, contextual knowledge base.

Recommended roadmap priorities

Phase 1 — Engine credibility

asset registry
ingestion
metadata
provenance
search
API
audit log

Phase 2 — Knowledge operation

relationships
transformations
workflow jobs
review state
permissions
derived artifacts

Phase 3 — AI and agent operation

grounded answers
citations
agent-safe APIs
dry-run and review gates
evaluation metrics
prompt/config provenance

Phase 4 — Enterprise hardening

advanced governance
retention and legal hold hooks
scaling and performance
observability
connector ecosystem
export and migration tooling

Sources consulted

Primary vendor and market sources consulted while preparing this document:

Microsoft SharePoint / SharePoint Premium: https://www.microsoft.com/en-us/microsoft-365/sharepoint/collaboration, https://support.microsoft.com/en-us/topic/ai-in-sharepoint-an-overview-c0b1efc3-81d0-4981-8be9-7ba3a75fae15
OpenText Content Cloud / AI Content Management: https://www.opentext.com/products/content-cloud, https://www.opentext.com/products/ai-content-management, https://www.opentext.com/products/core-content-management
Hyland content services / Alfresco / Nuxeo: https://www.hyland.com/en, https://www.hyland.com/en/solutions/products/alfresco-platform, https://www.hyland.com/en/solutions/products/nuxeo-platform
Box Intelligent Content Management / Box AI: https://www.box.com/home, https://www.box.com/overview, https://www.box.com/ai
M-Files: https://www.m-files.com/, https://www.m-files.com/m-files-platform/, https://www.m-files.com/press-releases/m-files-delivers-context-first-document-management-innovations/
Laserfiche: https://www.laserfiche.com/, https://www.laserfiche.com/products/ai/, https://www.laserfiche.com/products/document-and-records-management/
DocuWare: https://start.docuware.com/, https://start.docuware.com/intelligent-document-processing
Doxis / SER: https://marketplace.microsoft.com/de-de/product/saas/sergroupholdinginternationalgmbh1636119641023.doxis?tab=overview
iManage: https://imanage.com/, https://imanage.com/imanage-products/the-imanage-platform/, https://imanage.com/imanage-products/the-imanage-platform/ai/
NetDocuments: https://www.netdocuments.com/, https://www.netdocuments.com/solutions/legal-ai/
Glean: https://www.glean.com/, https://www.glean.com/product/overview, https://www.glean.com/connectors
Google Gemini Enterprise: https://docs.cloud.google.com/gemini/enterprise/docs, https://cloud.google.com/gemini-enterprise, https://cloud.google.com/gemini-enterprise/agents
Sinequa: https://www.sinequa.com/, https://www.sinequa.com/product/, https://www.sinequa.com/product/our-connectors/
Coveo: https://www.coveo.com/en, https://www.coveo.com/en/platform, https://www.coveo.com/en/platform/generative-ai
Elastic: https://www.elastic.co/enterprise-search, https://www.elastic.co/enterprise-search/vector-search
Dropbox Dash: https://dash.dropbox.com/, https://dash.dropbox.com/features/universal-search, https://dash.dropbox.com/security
Contentful: https://www.contentful.com/, https://www.contentful.com/solutions/composable-content-platform/, https://www.contentful.com/composable-content/
Contentstack: https://www.contentstack.com/, https://www.contentstack.com/platforms/headless-cms, https://www.contentstack.com/platform
Sanity: https://www.sanity.io/, https://www.sanity.io/content-lake, https://www.sanity.io/docs/getting-started/the-sanity-content-operating-system-an-introduction
Adobe Experience Manager / GenStudio: https://business.adobe.com/products/experience-manager/adobe-experience-manager.html, https://business.adobe.com/products/experience-manager/sites.html, https://business.adobe.com/products/experience-manager/assets.html, https://business.adobe.com/solutions/content-supply-chain.html
Atlassian Confluence: https://www.atlassian.com/software/confluence
Notion: https://www.notion.com/, https://www.notion.com/product/agents, https://www.notion.com/product/wikis
Guru: https://www.getguru.com/, https://www.getguru.com/solutions/ai-enterprise-search, https://help.getguru.com/docs/what-is-verifcation
ServiceNow Knowledge Management: https://www.servicenow.com/platform/knowledge-management.html, https://www.servicenow.com/docs/r/servicenow-platform/knowledge-management/knowledge-management.html
Strapi: https://strapi.io/, https://strapi.io/headless-cms
Directus: https://directus.io/, https://directus.io/toolkit/connect, https://directus.io/features/existing-database
Forrester content platforms market framing: https://www.forrester.com/blogs/highlights-from-the-forrester-wave-content-platforms-q1-2025/
McKinsey generative AI economic potential: https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier
AIIM Intelligent Information Management 2025: https://info.aiim.org/state-of-the-intelligent-information-management-industry-2025

Research date: 2026-05-05.
