kontextual-engine/docs/current-state-overlap-review.md
2026-05-06 17:22:51 +02:00

Current State Overlap Review

Date: 2026-05-06

Purpose

Compare the current kontextual-engine implementation with /home/worsch/markitect-main and /home/worsch/markitect-tool after completion of KONT-WP-0007.

The review asks two questions:

  1. Did we capture the useful successor scope from markitect-main?
  2. Did we accidentally create unhealthy overlap with markitect-tool?

Inputs Reviewed

  • Current engine implementation in src/kontextual_engine/.
  • Current workplans, especially KONT-WP-0005, KONT-WP-0006, and KONT-WP-0007.
  • Existing boundary notes:
    • docs/markitect-main-scope-assessment.md
    • docs/markitect-tool-reuse-boundary.md
    • docs/markitect-tool-integration-usecases.md
    • docs/system-layer-migration-backlog.md
  • markitect-main areas:
    • markitect/infospace/
    • markitect/assets/
    • markitect/spaces/
    • markitect/prompts/
    • markitect/query_paradigms/
    • infrastructure/repositories/
  • markitect-tool public API and source areas:
    • core, query, ops, backend, memory, policy, contract, schema, workflow, runtime, reference, and document_function.

Current Engine State

The engine now has a coherent runtime foundation:

  • canonical asset identity through KnowledgeAsset,
  • separate source, normalized, and derived representations,
  • source references, metadata records, lifecycle, versions, audit, policy, and idempotency,
  • durable memory and SQLite repository adapters,
  • ingestion adapters for local files, text, document metadata, datasets, and Markdown via markitect-tool,
  • governed retrieval with lexical search, filters, contextual entities, relationships, policy filtering, snippets, feedback, and KPI hooks.
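
The separation between asset identity and its representations can be sketched as follows. This is a minimal illustration, not the engine's actual model: only the names KnowledgeAsset and AssetRepresentation come from this review, while RepresentationKind, the field names, and the search_text() helper are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum


class RepresentationKind(Enum):
    # Hypothetical enum mirroring the three representation roles named above.
    SOURCE = "source"
    NORMALIZED = "normalized"
    DERIVED = "derived"


@dataclass(frozen=True)
class AssetRepresentation:
    kind: RepresentationKind
    media_type: str
    content: bytes

    @property
    def digest(self) -> str:
        # Content hash, so a representation can be compared or deduplicated.
        return hashlib.sha256(self.content).hexdigest()


@dataclass
class KnowledgeAsset:
    # The identity stays stable while representations are added or replaced.
    asset_id: str
    representations: dict[RepresentationKind, AssetRepresentation] = field(
        default_factory=dict
    )

    def attach(self, representation: AssetRepresentation) -> None:
        self.representations[representation.kind] = representation

    def search_text(self) -> str:
        # Retrieval reads only the normalized representation (assumed UTF-8).
        normalized = self.representations.get(RepresentationKind.NORMALIZED)
        return normalized.content.decode("utf-8") if normalized else ""


asset = KnowledgeAsset("asset-001")
asset.attach(
    AssetRepresentation(RepresentationKind.SOURCE, "text/markdown", b"# Title\n\nBody text.")
)
asset.attach(
    AssetRepresentation(RepresentationKind.NORMALIZED, "text/plain", b"Title Body text.")
)
```

The point of the shape is that replacing the normalized representation never changes asset_id, so references, audit records, and relationships can keep pointing at the same asset.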

This is now materially beyond the original minimal artifact/query scaffold. The canonical implementation is the core/, ports/, services/, and adapters/ architecture. The older simple modules such as artifacts.py, storage.py, query.py, context.py, ingestion.py, and relationships.py remain useful as compatibility scaffolding and migrated seed tests, but they should not be treated as the long-term canonical model.

Comparison With markitect-main

Healthy Successor Coverage

markitect-main mixed many concepts into one project: Markdown syntax tooling, infospace experiments, assets, spaces, prompt workflow machinery, query paradigms, UI, finance, issue tracking, provider adapters, and repository infrastructure.

The current engine has correctly lifted the system-layer concepts instead of porting the old package structure directly.

| markitect-main concept | Current engine successor | Assessment |
| --- | --- | --- |
| markitect/assets/* content-addressed asset records | KnowledgeAsset, AssetRepresentation, SourceReference, repository adapters | Healthy reimplementation. The engine model is more governed and cross-format. |
| markitect/infospace/models.py entity metadata | ContextEntity, metadata records, asset classification | Healthy abstraction. Domain-specific section fields were not copied. |
| markitect/infospace/relation_models.py triplets | CoreRelationship with target kind, confidence, actor, provenance, validity windows | Healthy reimplementation. More generic than VSM-specific relation metadata. |
| markitect/spaces/models.py information spaces | Partly covered by collections/tags/source context metadata, not yet a first-class scope container | Gap remains. A future collection/scope model should avoid recreating old rendering-oriented InformationSpace. |
| markitect/query_paradigms/base.py generic QueryResult | AssetQueryResult, ContextEntityQueryResult, RelationshipQueryResult | Healthy reimplementation. The engine now owns stable operational query contracts rather than query-paradigm plugins. |
| markitect/prompts/* artifact, dependency, quality, run, lineage concepts | Workplans KONT-WP-0008 and KONT-WP-0010 | Not implemented yet. These should influence transformation/workflow work next. |
| infrastructure/repositories/* SQLite/filesystem lessons | Engine repository ports and SQLite adapter | Healthy reimplementation. The old repository shape was document/workspace-specific and async-heavy. |
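
To make the relationship row concrete, here is a rough sketch of a relationship record carrying target kind, confidence, actor, provenance, and a validity window. Only the name CoreRelationship and the list of attributes come from this review; the field names, types, and the is_valid_at() helper are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CoreRelationship:
    subject_id: str
    predicate: str
    target_id: str
    target_kind: str          # e.g. "asset" or "context_entity" (hypothetical values)
    confidence: float
    actor: str
    provenance: str
    valid_from: Optional[datetime] = None   # None means open-ended start
    valid_until: Optional[datetime] = None  # None means open-ended end

    def is_valid_at(self, moment: datetime) -> bool:
        # Half-open window: valid_from is inclusive, valid_until is exclusive.
        if self.valid_from is not None and moment < self.valid_from:
            return False
        if self.valid_until is not None and moment >= self.valid_until:
            return False
        return True


relationship = CoreRelationship(
    subject_id="asset-001",
    predicate="describes",
    target_id="entity-042",
    target_kind="context_entity",
    confidence=0.8,
    actor="ingestion-adapter",
    provenance="sketch",
    valid_from=datetime(2026, 1, 1, tzinfo=timezone.utc),
    valid_until=datetime(2027, 1, 1, tzinfo=timezone.utc),
)
```

Nothing in this shape is specific to one domain vocabulary, which is the sense in which the table calls the successor more generic than the older relation metadata.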

Remaining markitect-main Gaps

The largest successor gaps are no longer retrieval gaps. They are workflow and operation-state gaps:

  • transformation runs and derived artifact lineage,
  • workflow templates, step state, retries, review gates, and failures,
  • quality gates, impact debt, recomputation, and traceability,
  • first-class collection/scope membership,
  • service/API surfaces and agent-safe operation envelopes,
  • export and enterprise-readiness concerns.

These gaps are already scheduled in later workplans, especially KONT-WP-0008, KONT-WP-0009, and KONT-WP-0010.

Comparison With markitect-tool

Healthy Boundary

The current implementation mostly respects the intended split:

  • markitect-tool owns Markdown parsing, selectors, transforms, includes, document contracts, Markdown schema validation, local snapshot identity, Markdown context packages, and Markdown-centered workflows.
  • kontextual-engine owns governed asset state, metadata, lifecycle, policy, audit, cross-format retrieval, relationship/context graph, feedback, and KPI hooks.

The executable boundary checks pass against the sibling checkout:

PYTHONPATH=/home/worsch/kontextual-engine/src:/home/worsch/markitect-tool/src \
  python3 -m pytest tests/test_markitect_tool_contract.py \
    tests/test_markitect_ingestion_adapter.py -q

10 passed

The current Markdown ingestion adapter delegates parsing and snapshot identity to public markitect_tool APIs. It persists serializable normalized representations and adapter metadata rather than storing Markitect runtime objects as engine domain state.
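
The delegation pattern described above might look roughly like this. It is a sketch under stated assumptions and does not call the real markitect_tool API: the parser is injected as a plain callable, and ParseFn, NormalizedRepresentation, ingest_markdown, and fake_parse are hypothetical names used only for this example.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical stand-in for whichever public markitect_tool call the real
# adapter uses; it returns a plain dict rather than a runtime object.
ParseFn = Callable[[str], dict[str, Any]]


@dataclass(frozen=True)
class NormalizedRepresentation:
    search_text: str
    payload_json: str   # serialized parser output, never a live parser object
    adapter: str


def ingest_markdown(text: str, parse: ParseFn) -> NormalizedRepresentation:
    parsed = parse(text)  # parsing is delegated, not reimplemented here
    # json.dumps raises TypeError if the parser leaks a non-serializable
    # runtime object, which enforces the boundary at ingestion time.
    payload_json = json.dumps(parsed, sort_keys=True)
    return NormalizedRepresentation(
        search_text=str(parsed.get("text", "")),
        payload_json=payload_json,
        adapter="markdown-sketch",
    )


def fake_parse(text: str) -> dict[str, Any]:
    # Toy parser for the example only: strips heading markers and joins lines.
    lines = [line.lstrip("# ").strip() for line in text.splitlines() if line.strip()]
    return {"text": " ".join(lines), "snapshot_id": "sketch-0"}


representation = ingest_markdown("# Title\n\nBody", fake_parse)
```

Injecting the parser keeps the engine side testable without the sibling checkout, while the contract tests above remain the place where the real public API is exercised.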

No Unhealthy Overlap Found

No current engine code reimplements these Markitect-owned capabilities:

  • Markdown AST construction,
  • selector language parsing,
  • Markitect document extraction,
  • Markdown transforms/includes/composition,
  • Markdown document contracts,
  • Markdown document schema validation,
  • Markitect context package activation,
  • Markitect local snapshot identity.

That is the important line, and we are still on the correct side of it.

Watchlist: Benign Today, Risky If Expanded

| Area | Current state | Risk | Recommended guardrail |
| --- | --- | --- | --- |
| Lexical retrieval index | Engine builds an in-memory substring index over normalized representation search_text. | Healthy as a cross-format MVP, but could become a duplicate of Markitect local index/FTS if expanded for Markdown-specific search. | Keep engine search backend-neutral. For durable Markdown FTS, wrap markitect_tool.backend.IndexBackend or query adapters instead of rebuilding Markitect local index semantics. |
| Source-grounded snippets | Engine creates offset snippets from normalized search text and carries Markitect provenance if present. | Healthy as cross-format fallback, but exact Markdown section/block snippets should not grow into a second selector engine. | For Markdown-specific snippets, call markitect_tool.query/reference through an adapter and persist selector/source-span provenance. |
| Policy primitives | Engine and Markitect both have a PolicyDecision concept. | Names overlap, but scopes differ: engine policy gates governed asset operations; Markitect policy filters Markdown objects/context packages. | Keep separate models and add explicit adapter mapping when using Markitect local/enterprise policy for Markdown-backed context. |
| Context packages | Engine still has an older simple ContextPackage scaffold while Markitect has rich Markdown context packages. | Future agent context work could duplicate Markitect package behavior. | In KONT-WP-0009, treat Markitect context packages as Markdown adapter payloads and make engine context packages cross-format, audited, and policy-aware. |
| Adapter post-processing | MarkitectMarkdownExtractor derives links/tables from serialized token/block payloads. | Low risk, but depends on Markitect serialization details. | Prefer public Markitect fields if they become available; add contract tests for link/table/source-span stability if these fields become operationally important. |
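
One way to keep the first two watchlist items contained is to put search behind a small port, so the substring index is only one interchangeable backend. The sketch below assumes hypothetical names (Hit, SearchBackend, SubstringBackend) and is not the engine's actual retrieval code; a Markdown-specific backend wrapping Markitect would implement the same protocol.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Hit:
    asset_id: str
    start: int      # offsets into the normalized search text
    end: int
    snippet: str


class SearchBackend(Protocol):
    # Hypothetical port: any backend that can answer a query with hits.
    def search(self, query: str) -> list[Hit]: ...


class SubstringBackend:
    """In-memory, case-sensitive substring search with offset snippets."""

    def __init__(self, context: int = 20) -> None:
        self._texts: dict[str, str] = {}
        self._context = context

    def index(self, asset_id: str, search_text: str) -> None:
        self._texts[asset_id] = search_text

    def search(self, query: str) -> list[Hit]:
        hits: list[Hit] = []
        if not query:
            return hits
        for asset_id, text in self._texts.items():
            pos = text.find(query)  # first occurrence only, for brevity
            if pos < 0:
                continue
            end = pos + len(query)
            lo = max(0, pos - self._context)
            hi = min(len(text), end + self._context)
            hits.append(Hit(asset_id, pos, end, text[lo:hi]))
        return hits


source_text = "Governed retrieval with lexical search and filters"
backend: SearchBackend = SubstringBackend(context=10)
backend.index("asset-001", source_text)
hits = backend.search("lexical")
```

Because callers depend only on SearchBackend, precise Markdown section or block snippets can later come from an adapter without the substring backend growing selector logic of its own.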

Recommendations

  1. Treat core/, ports/, services/, and adapters/ as canonical. Plan a cleanup pass to deprecate or quarantine the older simple artifacts.py/query.py/context.py scaffold once successor contracts are fully covered.
  2. Proceed to KONT-WP-0008 using markitect-main prompt/workflow tests as behavioral reference, not as code to port directly.
  3. For Markdown transformations in KONT-WP-0008, use markitect_tool.ops and markitect_tool.workflow adapters. The engine should persist runs, inputs, outputs, decisions, provenance, and derived artifacts.
  4. For future Markdown snippet precision, add an adapter that calls markitect_tool.query and markitect_tool.reference instead of expanding engine substring search into a Markdown selector system.
  5. Add a small policy mapping contract before integrating Markitect local or enterprise policy into engine retrieval/context packages.
  6. Keep the current Markitect boundary tests in CI or at least in the standard integration check profile. They are doing exactly the right job.
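
The policy mapping contract from recommendation 5 could start as small as the sketch below. Both decision types are hypothetical stand-ins: the review only states that each side has a PolicyDecision concept, so the fields, the EngineEffect enum, and to_engine_decision are assumptions for illustration rather than either project's real model.

```python
from dataclasses import dataclass
from enum import Enum


class EngineEffect(Enum):
    ALLOW = "allow"
    DENY = "deny"


@dataclass(frozen=True)
class EnginePolicyDecision:
    # Engine side: gates governed asset operations and is recorded in audit.
    effect: EngineEffect
    reason: str
    source: str


@dataclass(frozen=True)
class MarkdownPolicyDecision:
    # Hypothetical stand-in for a Markitect-side decision about a Markdown object.
    allowed: bool
    rule: str


def to_engine_decision(decision: MarkdownPolicyDecision) -> EnginePolicyDecision:
    # The mapping is explicit and one-directional, so the two models stay
    # separate and the origin of the decision remains visible.
    effect = EngineEffect.ALLOW if decision.allowed else EngineEffect.DENY
    return EnginePolicyDecision(
        effect=effect,
        reason=f"markdown policy rule: {decision.rule}",
        source="markitect-adapter",
    )


denied = to_engine_decision(MarkdownPolicyDecision(allowed=False, rule="private-section"))
allowed = to_engine_decision(MarkdownPolicyDecision(allowed=True, rule="public"))
```

A contract test over a function like this would pin down how each Markdown-side outcome is interpreted before any Markitect policy result influences engine retrieval or context packages.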