Complete system-layer extraction plan

2026-05-05 01:17:48 +02:00
parent 67010a0429
commit 902ba7352d
8 changed files with 345 additions and 21 deletions

@@ -2,7 +2,7 @@
# Custodian Brief — kontextual-engine
**Domain:** markitect
**Last synced:** 2026-05-04 22:52 UTC
**Last synced:** 2026-05-04 23:16 UTC
**State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*
## Active Workstreams
@@ -20,16 +20,6 @@ Progress: 0/8 done | workstream_id: `0fd08391-e8c9-4f1b-ace4-06439f958e88`
- · I3.7 - Implement agent context surface `96689817`
- … and 1 more open tasks
### markitect-main System-Layer Extraction
Progress: 1/6 done | workstream_id: `e46d0962-7451-4b6c-b39f-461e35ba6a76`
**Open tasks:**
- · S2.2 - Inventory persistence and repository code `86a1bf90`
- · S2.3 - Inventory infospace models and relationships `8b88b3fa`
- · S2.4 - Inventory orchestration and run-manifest material `1f15f603`
- · S2.5 - Inventory API and query experiments `0a1e5a4b`
- · S2.6 - Produce migration backlog `54a7e7a7`
---
## MCP Orientation (when available)

@@ -11,6 +11,9 @@ Start here:
- `SCOPE.md`
- `docs/stack-decision.md`
- `docs/markitect-main-scope-assessment.md`
- `docs/markitect-tool-reuse-boundary.md`
- `docs/system-layer-extraction-inventory.md`
- `docs/system-layer-migration-backlog.md`
- `workplans/`
## Development

@@ -107,7 +107,7 @@ belong in `infospace-bench`.
---
## Related / Overlapping Repositories
## Related / Overlapping
- `markitect-main` — legacy mixed platform; source for candidate behavior and
tests, not the target architecture.
@@ -144,4 +144,3 @@ title: Agent-operable knowledge workflows
description: Provides planned APIs and workflow surfaces that let agents access context, trigger transformations, and operate over durable knowledge state.
keywords: [agent, workflow, context, automation, knowledge]
```

@@ -15,6 +15,12 @@ The most important inheritance is not old module structure. It is the concept
of a durable infospace-like knowledge environment with typed artifacts,
relationships, evaluation/composition workflows, and agent-operable context.
Detailed follow-up:
- `docs/markitect-tool-reuse-boundary.md`
- `docs/system-layer-extraction-inventory.md`
- `docs/system-layer-migration-backlog.md`
## In-Scope Candidates
| FRS area | markitect-main evidence | Recommendation |
@@ -70,4 +76,3 @@ kontextual_engine/
The first implementation workplan should validate this shape against migrated
tests before committing to a framework or storage backend.

@@ -0,0 +1,47 @@
# markitect-tool Reuse Boundary
Date: 2026-05-05
## Purpose
This note records what `kontextual-engine` should reuse from
`markitect-tool` instead of reimplementing. `markitect-tool` is the syntax
layer; `kontextual-engine` is the system/runtime layer.
## Reuse As Adapter Dependencies
| Need in kontextual-engine | markitect-tool owner | Reuse posture |
| --- | --- | --- |
| Markdown parsing and structured document snapshots | `markitect_tool.core.parser`, `markitect_tool.core.document`, `markitect_tool.backend.engine.DocumentSnapshot` | Call through a markdown ingestion adapter. Persist normalized artifacts here, but do not parse markdown here. |
| Document-level selectors and extraction | `markitect_tool.query`, `docs/query-extraction.md` | Use for markdown source extraction and context package creation. Engine query should operate over persisted artifacts and relationships. |
| Deterministic transforms, composition, and includes | `markitect_tool.ops.engine`, `docs/transform-compose-include.md` | Treat as external operations invoked by workflows. Store operation provenance and derived artifacts in the engine. |
| Contract checks, runtime context, forms, and assessments | `markitect_tool.contract.*`, `markitect_tool.runtime.*`, `docs/runtime-context-forms-assessments.md` | Use as validation/assessment step adapters. Engine owns run state and audit trail. |
| Backend manifests, local snapshots, FTS, and query adapters | `markitect_tool.backend.*`, `docs/backend-fabric.md` | Reuse snapshot identity and local index concepts. Engine storage remains separate and cross-format. |
| Agent working memory context packages | `markitect_tool.memory.engine`, `docs/agent-working-memory.md` | Reuse as a portable context-package format for markdown-backed context. Engine should provide durable context registries across formats. |
| Workflow definition syntax and markdown-centered step kinds | `markitect_tool.workflow.*`, `docs/workflow-definition-standard.md` | Reuse where workflows consume markdown inputs. Engine workflows should generalize to artifact collections, external tools, and service operations. |
| Document functions, templates, and generation hooks | `markitect_tool.document_function`, `markitect_tool.generation` | Invoke as syntax-layer processors. Keep provider calls behind `llm-connect`. |
| Local label policy and policy adapter protocols | `markitect_tool.policy.*` | Reuse for markdown source/package filtering. Engine should expose policy-aware operations at artifact/service level. |
## Do Not Reimplement Here
- Markdown ASTs, section trees, frontmatter parsing, explode/implode, document
transforms, includes, contract files, and selector syntax.
- Local markdown FTS/index refresh internals unless a durable engine backend
explicitly wraps them.
- CLI-first command behavior from `mkt`.
- Provider-specific LLM adapters or prompt adapter internals.
## Engine-Owned Responsibilities
`kontextual-engine` should own the durable runtime concepts that sit above
`markitect-tool`:
- Artifact and collection identity across formats.
- Persistent artifact metadata, content digests, lineage, and lifecycle state.
- Relationship graphs between artifacts and collections.
- Workflow runs, step state, input/output bundles, errors, and provenance.
- Cross-artifact dependency graph, recomputation, and impact debt.
- Agent-operable context continuity and service/programmatic APIs.
- Adapter registry that can call `markitect-tool`, `llm-connect`, and storage
backends without embedding their internals.
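The adapter registry in the last bullet could look roughly like the sketch below. This is an illustration under assumptions, not the planned API: `Adapter`, `AdapterRegistry`, the `"markdown.ingest"` key, and `fake_markdown_ingest` are hypothetical names, and the fake adapter stands in for a real call into `markitect-tool`.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


class Adapter(Protocol):
    """Hypothetical adapter contract: take a payload dict, return a result dict."""

    def __call__(self, payload: dict) -> dict: ...


@dataclass
class AdapterRegistry:
    """Maps adapter names to callables so the engine never imports their internals."""

    _adapters: Dict[str, Adapter] = field(default_factory=dict)

    def register(self, name: str, adapter: Adapter) -> None:
        # Refuse silent overwrites so an adapter name always means one thing.
        if name in self._adapters:
            raise ValueError(f"adapter already registered: {name}")
        self._adapters[name] = adapter

    def call(self, name: str, payload: dict) -> dict:
        try:
            adapter = self._adapters[name]
        except KeyError:
            raise LookupError(f"unknown adapter: {name}") from None
        return adapter(payload)


def fake_markdown_ingest(payload: dict) -> dict:
    # Stand-in for a markitect-tool call; no markdown is parsed here.
    return {"kind": "markdown", "source": payload["path"]}


registry = AdapterRegistry()
registry.register("markdown.ingest", fake_markdown_ingest)
result = registry.call("markdown.ingest", {"path": "notes/a.md"})
```

Keeping the registry keyed by plain strings means workflows can reference adapters by name while the engine stays free of direct imports from the syntax layer.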

@@ -0,0 +1,166 @@
# System-Layer Extraction Inventory
Date: 2026-05-05
Sources reviewed:
- `/home/worsch/markitect-main/infrastructure/`
- `/home/worsch/markitect-main/markitect/infospace/`
- `/home/worsch/markitect-main/markitect/prompts/`
- `/home/worsch/markitect-main/migrations/prompts/`
- `/home/worsch/markitect-main/roadmap/prompt-dependency-resolution/`
- `/home/worsch/markitect-main/markitect/query_paradigms/`
- `/home/worsch/markitect-main/markitect/plugins/builtin/search/`
- `/home/worsch/markitect-tool/`
## Executive Classification
| Area | Legacy evidence | Destination |
| --- | --- | --- |
| Addressable artifacts and digests | `markitect/prompts/models.py`, `migrations/prompts/001_create_artifacts_table.sql`, prompt tests | Reimplement in `kontextual-engine` as core artifact model. |
| Artifact storage/repositories | `markitect/prompts/repositories/`, `infrastructure/repositories/sqlite_repository.py` | Reimplement storage interface; use old code as behavior reference only. |
| Resolution across spaces | `markitect/prompts/resolver/`, `002_create_resolution_config.sql` | Reimplement as collection/context resolution policy. |
| Runs and manifests | `markitect/prompts/execution/`, `003_create_runs_and_manifests.sql`, `RunManifestSchema.md` | Reimplement as workflow run model. |
| Dependency graph | `markitect/prompts/dependencies/`, `004_create_dependencies.sql` | Reimplement with artifact relationship graph semantics. |
| Incremental recompute and impact debt | `markitect/prompts/incremental/`, `005_create_changes_and_debt.sql` | Reimplement after artifact/run persistence exists. |
| Quality gates and halting | `markitect/prompts/quality/`, `006_create_quality_tables.sql` | Reimplement as provider-neutral validation policy; delegate markdown validation to `markitect-tool`. |
| Infospace entities and relationships | `markitect/infospace/models.py`, `relation_models.py`, `graph_export.py`, `state.py`, `evaluation.py` | Extract vocabulary and tests; generalize beyond markdown/project directories. |
| Source pipelines | `markitect/infospace/pipeline.py` | Reimplement as engine workflow concepts; delegate markdown/template operations to adapters. |
| Query paradigms and search | `markitect/query_paradigms/`, FTS plugin | Use as design evidence; reuse `markitect-tool` query/index APIs instead. |
| Production/config/logging utilities | `markitect/production/`, `infrastructure/logging/` | Mostly out of scope; keep structured error/audit ideas. |
## Persistence And Repository Findings
Relevant legacy concepts:
- Artifact identity: UUID, `space_id`, name, type, content digest, content
size, metadata, timestamps.
- Content addressing: SHA-256 digest used for change detection and
idempotency.
- Repository behavior: create/read/update/delete, duplicate detection,
lookup by name or digest, pagination, structured errors.
- SQLite tables: `prompt_artifacts`, `prompt_resolution_config`,
`prompt_runs`, `run_manifests`, `prompt_dependencies`,
`artifact_changes`, `impact_debt`, `quality_gates`,
`validation_results`.
Extraction decision:
- Reimplement artifact persistence around engine-owned `Artifact`,
`Collection`, `ArtifactVersion`, `Relationship`, and `OperationRun`.
- Do not reuse legacy repository classes directly; they mix issue/project
repositories, old workspace assumptions, and Gitea-specific error paths.
- Preserve tests for digest determinism, artifact reference parsing,
duplicate handling, idempotency hashes, and dependency graph operations.
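A minimal sketch of the content-addressed artifact idea described above, assuming a dataclass-based model. The field names follow the legacy concept list, but the `Artifact` class, `from_content`, and `has_changed` are hypothetical and do not reproduce the legacy code.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


def content_digest(content: bytes) -> str:
    """SHA-256 hex digest; identical bytes always give the same digest."""
    return hashlib.sha256(content).hexdigest()


@dataclass(frozen=True)
class Artifact:
    # Hypothetical engine-owned model; not the legacy prompt artifact class.
    collection_id: str
    name: str
    type: str
    digest: str
    size: int
    metadata: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @classmethod
    def from_content(cls, collection_id: str, name: str, type: str,
                     content: bytes, **metadata) -> "Artifact":
        return cls(collection_id=collection_id, name=name, type=type,
                   digest=content_digest(content), size=len(content),
                   metadata=metadata)

    def has_changed(self, content: bytes) -> bool:
        """Change detection: compare the stored digest with new content."""
        return content_digest(content) != self.digest


artifact = Artifact.from_content("c1", "intro", "text", b"hello", author="a")
```

The digest depends only on the content bytes, so two ingestions of the same bytes can be recognized as duplicates even when their ids and timestamps differ.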
## Infospace Model And Relationship Findings
Relevant legacy concepts:
- `EntityMeta`: slug, title, typed section contents, derived metrics, source
path, section slugs.
- `RelationMeta`: subject, predicate, object, relation type, evidence, source
path, edge tuple.
- `InfospaceState`: aggregate state with entity count, domains, latest
evaluation snapshot, viability checks.
- `EvaluationSnapshot`: per-entity scores, collection metrics, score/metric
diffs.
- `EntityGraph`: nodes, multiedges, feedback loop membership, filters, export.
- Discipline composition: reusable conceptual collection, path resolution,
viability, stale mapping detection.
Extraction decision:
- Convert `EntityMeta` into a generic artifact facet, not an engine-only entity
type.
- Convert `RelationMeta` into first-class relationship records with typed
predicates, provenance, and evidence.
- Keep evaluation snapshots as generic assessment snapshots attached to
artifacts, collections, or workflow runs.
- Keep discipline composition as collection dependency and mapping freshness,
without assuming markdown directories.
- Use `markitect-tool` for parsing markdown entities and extracting sections.
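As an illustration of a first-class relationship record with a typed predicate, evidence, and provenance, here is a small sketch. The `Relationship` class and its field names are assumptions made for this example; only the subject/predicate/object shape and the edge tuple come from the legacy `RelationMeta` description above.

```python
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Relationship:
    # Hypothetical record shape; field names are illustrative.
    source_id: str
    predicate: str
    target_id: str
    relation_type: str = "generic"
    evidence: Optional[str] = None
    provenance: Optional[str] = None

    @property
    def edge(self) -> Tuple[str, str, str]:
        """Edge tuple used to compare or deduplicate graph edges."""
        return (self.source_id, self.predicate, self.target_id)

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "Relationship":
        return cls(**data)


rel = Relationship("artifact-a", "depends_on", "artifact-b",
                   evidence="section 2 cites artifact-b",
                   provenance="run-123")
restored = Relationship.from_dict(rel.to_dict())
```

A roundtrip through `to_dict` and `from_dict` is the kind of behavior the migrated relationship tests could assert without importing any legacy module.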
## Orchestration And Run Manifest Findings
Relevant legacy concepts:
- Prompt run stages: template analysis, context compilation, prompt processing.
- Input bundle hash: template digest, ordered dependency digests, resolver
config, model settings, compilation options.
- Nested generator runs with `parent_run_id` and depth limits.
- Run manifest: resolved inputs, compiled prompt digest, model config,
outputs, dependency edges, validation results, impact debt, timing metadata.
- Recompute defaults: depth 1, circular suppression, budget limits, impact
thresholds.
- Traceability: produced artifacts link back to templates, inputs, runs, and
dependency edges.
Extraction decision:
- Generalize `PromptRun` into `OperationRun` / `WorkflowRun`, where prompt
execution is one operation kind.
- Keep the input bundle hash concept as a universal idempotency key for
deterministic and assisted operations.
- Keep run manifests as durable, inspectable records. Define the manifest as a
Pydantic model before committing to any database schema.
- Do not embed LLM provider execution; route provider calls through
`llm-connect` adapters.
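The input bundle hash can be sketched as a SHA-256 over a canonical JSON encoding. This is one possible approach under assumptions: the function name and the choice of JSON canonicalization are hypothetical, and the legacy implementation may serialize differently. Dictionary key order does not affect the result, while the order of dependency digests does, matching the "ordered dependency digests" wording above.

```python
import hashlib
import json
from typing import Any, Dict, Sequence


def input_bundle_hash(template_digest: str,
                      dependency_digests: Sequence[str],
                      resolver_config: Dict[str, Any],
                      model_settings: Dict[str, Any],
                      compilation_options: Dict[str, Any]) -> str:
    """Idempotency key over all inputs of an operation (illustrative sketch)."""
    bundle = {
        "template_digest": template_digest,
        # A list keeps its order in JSON, so dependency order stays significant.
        "dependency_digests": list(dependency_digests),
        "resolver_config": resolver_config,
        "model_settings": model_settings,
        "compilation_options": compilation_options,
    }
    # sort_keys makes the encoding independent of dict insertion order,
    # including for nested dicts; compact separators avoid whitespace drift.
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


h1 = input_bundle_hash("t1", ["d1", "d2"], {"a": 1, "b": 2}, {"temp": 0}, {})
h2 = input_bundle_hash("t1", ["d1", "d2"], {"b": 2, "a": 1}, {"temp": 0}, {})
h3 = input_bundle_hash("t1", ["d2", "d1"], {"a": 1, "b": 2}, {"temp": 0}, {})
```

Here `h1` and `h2` are equal because only key order differs, while `h3` differs because the dependency order changed.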
## API, Query, And Retrieval Findings
Relevant legacy concepts:
- `QueryResult` standard envelope: paradigm, query, timing, count, results,
metadata, success/error.
- Query registry: discoverable paradigms by name/category/complexity.
- SQLite FTS indexer: FTS5 tables, rebuild, optimize, availability checks.
- GraphQL/REST paradigm files exist, but they are better read as evidence of
desired integrations than as a target API contract.
Extraction decision:
- Engine query should return stable result envelopes with metadata,
provenance, and structured errors.
- Query should operate over artifact ids, metadata, relationships, content
references, run records, and assessment snapshots.
- Reuse `markitect-tool` selector/query and local index APIs for markdown
content. Do not port the old query paradigm registry wholesale.
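A stable result envelope along the lines described above might look like the following sketch. The `QueryResult` fields mirror the legacy envelope list (query, timing, count, results, metadata, success/error), but the class layout and the `run_query` helper are assumptions for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable, List, Optional


@dataclass
class QueryResult:
    # Hypothetical envelope; derived fields avoid count/success drifting
    # out of sync with the stored results and error.
    query: str
    results: List[Any] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)
    elapsed_ms: float = 0.0
    error: Optional[str] = None

    @property
    def success(self) -> bool:
        return self.error is None

    @property
    def count(self) -> int:
        return len(self.results)


def run_query(query: str, fn: Callable[[str], Iterable[Any]]) -> QueryResult:
    """Run fn(query) and always return an envelope, even on failure."""
    started = time.perf_counter()
    results: List[Any] = []
    error: Optional[str] = None
    try:
        results = list(fn(query))
    except Exception as exc:  # turned into a structured error string
        error = f"{type(exc).__name__}: {exc}"
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    return QueryResult(query=query, results=results,
                       elapsed_ms=elapsed_ms, error=error)


ok = run_query("name:intro", lambda q: ["artifact-1"])
```

Callers can then branch on `ok.success` instead of catching backend-specific exceptions.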
## Candidate Test Sources
High-value legacy tests to mine:
- `tests/unit/prompts/test_artifact_models.py`
- `tests/unit/prompts/test_artifact_repository.py`
- `tests/unit/prompts/test_template_models.py`
- `tests/unit/prompts/test_macro_parser.py`
- `tests/unit/prompts/test_resolution_strategy.py`
- `tests/unit/prompts/test_context_compiler.py`
- `tests/unit/prompts/test_execution_models.py`
- `tests/unit/prompts/test_execution_engine.py`
- `tests/unit/prompts/test_dependency_models.py`
- `tests/unit/prompts/test_dependency_repository.py`
- `tests/unit/prompts/test_incremental_engine.py`
- `tests/unit/prompts/test_change_detector.py`
- `tests/unit/prompts/test_impact_analyzer.py`
- `tests/unit/prompts/test_quality_gates.py`
- `tests/unit/prompts/test_halting_policy.py`
- `tests/unit/prompts/test_traceability_service.py`
- `tests/integration/prompts/test_dependency_graph.py`
- `tests/integration/prompts/test_incremental_recompute.py`
- `tests/integration/prompts/test_quality_validation.py`
- `tests/integration/prompts/test_traceability_workflow.py`
- `tests/unit/infospace/test_entity_parser.py`
- `tests/unit/infospace/test_evaluation.py`
- `tests/unit/infospace/test_composition.py`
- `tests/unit/infospace/test_checks.py`
- `markitect/query_paradigms/tests/test_query_paradigms.py`
Test migration posture:
- Migrate expected behavior, not imports.
- Rewrite around `kontextual_engine.*` contracts.
- Keep markdown parsing fixtures behind `markitect-tool` adapters.
- Treat prompt tests as the first source for `KONT-WP-0003` unit tests.

@@ -0,0 +1,103 @@
# System-Layer Migration Backlog
Date: 2026-05-05
This backlog is the output of `KONT-WP-0002`. It feeds
`KONT-WP-0003: Headless Knowledge Engine Implementation`.
## Migration Strategies
- `migrate-test`: port or rewrite old tests first.
- `reimplement`: build a new implementation around the `kontextual-engine`
PRD/FRS.
- `adapter`: call another repo, especially `markitect-tool` or `llm-connect`.
- `defer`: keep documented, but do not build in the first implementation slice.
- `out-of-scope`: leave in legacy or another repo.
## P0 - Contract Baseline
| Item | Strategy | Source | Output |
| --- | --- | --- | --- |
| Define engine model names and module layout | reimplement | PRD/FRS, scope assessment | `kontextual_engine/artifacts`, `collections`, `storage`, `relationships`, `workflows`, `query`, `context`, `integrations` |
| Structured error envelope | reimplement | `infrastructure/exceptions.py`, `production/error_handler.py` | Common `KontextualError` / diagnostic model |
| Adapter boundary to `markitect-tool` | adapter | `docs/markitect-tool-reuse-boundary.md` | Markdown ingestion adapter protocol |
## P1 - Artifacts, Collections, And Persistence
| Item | Strategy | Source | Acceptance direction |
| --- | --- | --- | --- |
| `Artifact` with id, collection id, name, type, digest, size, metadata, timestamps | migrate-test + reimplement | `markitect/prompts/models.py`, `test_artifact_models.py` | Deterministic digest and serialization tests pass. |
| `ArtifactReference` / address parser | migrate-test + reimplement | `ArtifactReference.parse` tests | Supports local, collection-qualified, and versioned refs. |
| `Collection` model replacing legacy `InformationSpace` assumptions | reimplement | infospace config/state concepts | Supports nested or related collections without filesystem coupling. |
| Storage repository protocol | migrate-test + reimplement | prompt repository tests, SQL migrations | CRUD, duplicate detection, lookup by id/name/digest. |
| First in-memory repository | reimplement | none | Enables fast unit tests before SQLite decision. |
| SQLite schema design note | reimplement | prompt migrations 001-006 | Decide whether to use SQLAlchemy or direct SQLite in a later slice. |
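The "first in-memory repository" row could start from something like the sketch below, covering CRUD, duplicate detection, and lookup by id, name, or digest. All names here (`ArtifactRecord`, `InMemoryArtifactRepository`, `DuplicateArtifactError`) are hypothetical, and treating a repeated name within one collection as a duplicate is an assumption about the desired rule.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class DuplicateArtifactError(Exception):
    """Raised when an id or a (collection, name) pair is already stored."""


@dataclass(frozen=True)
class ArtifactRecord:
    id: str
    collection_id: str
    name: str
    digest: str


class InMemoryArtifactRepository:
    def __init__(self) -> None:
        self._by_id: Dict[str, ArtifactRecord] = {}

    def create(self, record: ArtifactRecord) -> ArtifactRecord:
        if record.id in self._by_id:
            raise DuplicateArtifactError(f"id exists: {record.id}")
        if self.get_by_name(record.collection_id, record.name) is not None:
            raise DuplicateArtifactError(f"name exists: {record.name}")
        self._by_id[record.id] = record
        return record

    def get(self, artifact_id: str) -> Optional[ArtifactRecord]:
        return self._by_id.get(artifact_id)

    def get_by_name(self, collection_id: str, name: str) -> Optional[ArtifactRecord]:
        for record in self._by_id.values():
            if record.collection_id == collection_id and record.name == name:
                return record
        return None

    def find_by_digest(self, digest: str) -> List[ArtifactRecord]:
        return [r for r in self._by_id.values() if r.digest == digest]

    def delete(self, artifact_id: str) -> bool:
        """Return True if a record was removed, False if the id was unknown."""
        return self._by_id.pop(artifact_id, None) is not None


repo = InMemoryArtifactRepository()
repo.create(ArtifactRecord("a1", "c1", "intro", "abc"))
repo.create(ArtifactRecord("a2", "c1", "outro", "abc"))
```

Tests written against this shape should carry over unchanged once a SQLite-backed repository implements the same methods.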
## P2 - Relationships And Graphs
| Item | Strategy | Source | Acceptance direction |
| --- | --- | --- | --- |
| `Relationship` model with source, target, predicate, type, evidence, provenance | migrate-test + reimplement | `RelationMeta`, `DependencyEdge` | Relationship roundtrip and edge tuple tests. |
| Artifact dependency graph | migrate-test + reimplement | `DependencyGraph`, dependency tests | Successor/predecessor, cycle detection, topological sort. |
| Collection dependency and stale mapping model | reimplement | `composition.py`, discipline status/stale mapping | Can detect missing target relationships. |
| Graph query surface | reimplement | `EntityGraph`, query result envelope | Return nodes/edges with provenance, not Mermaid/DOT. |
| Graph export | adapter/defer | `graph_export.py`, `markitect-tool` docs | Defer visual export; engine returns data. |
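For the artifact dependency graph row, the acceptance direction (successor/predecessor, cycle detection, topological sort) can be sketched with the standard library's `graphlib` module (Python 3.9+). The `DependencyGraph` class below is a hypothetical illustration, not the legacy class of the same name, and it treats predecessors as "the artifacts this one depends on".

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


class DependencyGraph:
    def __init__(self) -> None:
        # Maps each artifact id to the set of ids it depends on.
        self._deps: Dict[str, Set[str]] = {}

    def add_dependency(self, artifact_id: str, depends_on: str) -> None:
        self._deps.setdefault(artifact_id, set()).add(depends_on)
        self._deps.setdefault(depends_on, set())

    def predecessors(self, artifact_id: str) -> Set[str]:
        """Direct dependencies of artifact_id."""
        return set(self._deps.get(artifact_id, set()))

    def successors(self, artifact_id: str) -> Set[str]:
        """Artifacts that directly depend on artifact_id."""
        return {node for node, deps in self._deps.items() if artifact_id in deps}

    def build_order(self) -> List[str]:
        """Dependencies first; raises graphlib.CycleError on a cycle."""
        return list(TopologicalSorter(self._deps).static_order())

    def has_cycle(self) -> bool:
        try:
            self.build_order()
        except CycleError:
            return True
        return False


graph = DependencyGraph()
graph.add_dependency("report", "summary")
graph.add_dependency("summary", "source")
order = graph.build_order()
```

For this linear chain the only valid order is source, summary, report, which is also the order a recompute would have to follow.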
## P3 - Workflow Runs And Manifests
| Item | Strategy | Source | Acceptance direction |
| --- | --- | --- | --- |
| `OperationRun` / `WorkflowRun` lifecycle | migrate-test + reimplement | `PromptRun`, `test_execution_models.py` | Stage transitions, success/fail/skip semantics. |
| `InputBundle` idempotency hash | migrate-test + reimplement | `InputBundle`, `RunManifestSchema.md` | Hash independent of dict ordering. |
| Run manifest model | migrate-test + reimplement | `RunManifestSchema.md`, migration 003 | Captures inputs, outputs, dependencies, validation, impact debt, timing. |
| Nested operation runs | migrate-test + reimplement | parent run/depth tests | Parent/child links and max-depth checks. |
| Workflow step adapter registry | adapter + reimplement | `markitect-tool` workflow standard | Steps call adapters; engine persists run state. |
| Prompt execution operation kind | adapter/defer | prompt execution engine | Defer provider execution until `llm-connect` adapter is explicit. |
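The lifecycle and nesting rows above can be illustrated with a small state machine. This is a sketch under assumptions: the status names, the allowed transitions, and the `spawn_child` helper are hypothetical, and `max_depth` is passed explicitly because the actual depth limit is not specified here.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class RunStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    SKIPPED = "skipped"


# Terminal states (succeeded, failed, skipped) have no outgoing transitions.
_ALLOWED = {
    RunStatus.PENDING: {RunStatus.RUNNING, RunStatus.SKIPPED},
    RunStatus.RUNNING: {RunStatus.SUCCEEDED, RunStatus.FAILED},
}


class MaxDepthExceeded(Exception):
    """Raised when a nested run would exceed the configured depth limit."""


@dataclass
class OperationRun:
    kind: str
    parent_run_id: Optional[str] = None
    depth: int = 0
    status: RunStatus = RunStatus.PENDING
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def transition(self, new_status: RunStatus) -> None:
        if new_status not in _ALLOWED.get(self.status, set()):
            raise ValueError(f"illegal transition: {self.status.value} -> {new_status.value}")
        self.status = new_status

    def spawn_child(self, kind: str, max_depth: int) -> "OperationRun":
        if self.depth + 1 > max_depth:
            raise MaxDepthExceeded(f"depth {self.depth + 1} > {max_depth}")
        return OperationRun(kind=kind, parent_run_id=self.id, depth=self.depth + 1)


root = OperationRun(kind="workflow")
root.transition(RunStatus.RUNNING)
child = root.spawn_child("prompt", max_depth=1)
```

Prompt execution is then just one value of `kind`, which fits the goal of generalizing `PromptRun` into an operation run.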
## P4 - Ingestion And Normalization
| Item | Strategy | Source | Acceptance direction |
| --- | --- | --- | --- |
| Ingestion adapter protocol | reimplement | PRD/FRS FR-020/021 | Accepts bytes/text/path plus media type and returns normalized artifact(s). |
| Markdown ingestion adapter | adapter | `markitect_tool.core.parser`, backend snapshots | Calls `markitect-tool`; does not parse markdown internally. |
| Infospace fixture adapter | migrate-test + adapter | `examples/infospace-with-history` | Import a small fixture subset as collections/artifacts/relationships. |
| Multi-format placeholder interfaces | reimplement/defer | PRD/FRS | Design now; implement non-markdown later. |
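The ingestion adapter protocol row (bytes/text/path plus media type in, normalized artifacts out) could be expressed with `typing.Protocol`, as in this sketch. `IngestionAdapter`, `NormalizedArtifact`, `read_source`, and `PlainTextAdapter` are hypothetical names; a real markdown adapter would delegate to `markitect-tool` rather than handle content itself.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from typing import List, Protocol, Union, runtime_checkable

Source = Union[bytes, str, Path]


@dataclass(frozen=True)
class NormalizedArtifact:
    name: str
    media_type: str
    digest: str
    size: int


@runtime_checkable
class IngestionAdapter(Protocol):
    def supports(self, media_type: str) -> bool: ...

    def ingest(self, source: Source, media_type: str, name: str) -> List[NormalizedArtifact]: ...


def read_source(source: Source) -> bytes:
    """Normalize the three accepted input kinds to bytes."""
    if isinstance(source, bytes):
        return source
    if isinstance(source, Path):
        return source.read_bytes()
    return source.encode("utf-8")  # plain str is treated as text, not a path


class PlainTextAdapter:
    """Trivial example adapter; it satisfies the protocol structurally."""

    def supports(self, media_type: str) -> bool:
        return media_type == "text/plain"

    def ingest(self, source: Source, media_type: str, name: str) -> List[NormalizedArtifact]:
        data = read_source(source)
        digest = hashlib.sha256(data).hexdigest()
        return [NormalizedArtifact(name, media_type, digest, len(data))]


adapter = PlainTextAdapter()
artifacts = adapter.ingest("hello", "text/plain", "greeting")
```

Because the protocol is structural, adapters living in other repos would not need to import or subclass anything from the engine to conform.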
## P5 - Query And Retrieval
| Item | Strategy | Source | Acceptance direction |
| --- | --- | --- | --- |
| Engine `QueryResult` envelope | migrate-test + reimplement | query paradigm `QueryResult` | Includes query, count, results, metadata, diagnostics. |
| Query by id/name/digest/metadata | reimplement | storage requirements | Deterministic repository-backed tests. |
| Query relationships and provenance | reimplement | graph/dependency tests | Finds neighbors, dependents, ancestors, producing runs. |
| Markdown selector passthrough | adapter | `markitect-tool` query/extract | Only available for markdown artifacts with source snapshots. |
| FTS/vector search | adapter/defer | `markitect-tool` local index, FTS plugin | Defer durable FTS until storage backend is chosen. |
## P6 - Quality, Assessment, And Impact
| Item | Strategy | Source | Acceptance direction |
| --- | --- | --- | --- |
| Assessment snapshot model | migrate-test + reimplement | `EvaluationSnapshot`, infospace tests | Attach scores/metrics to artifact, collection, or run. |
| Quality gate records | migrate-test + reimplement | quality models/tests, migration 006 | Persist pass/fail/skipped with diagnostics. |
| Halting policy model | migrate-test + reimplement | quality policy tests | Uses gate status, iteration limits, budgets, improvement. |
| Artifact change records | migrate-test + reimplement | `ArtifactChange`, migration 005 | Created/modified/deleted with old/new digest. |
| Impact debt records | migrate-test + reimplement | `ImpactDebt`, integration tests | Records suppressed recompute with magnitude and reason. |
| LLM-assessed impact | adapter/defer | prompt FRS | Use `llm-connect` later; deterministic metrics first. |
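The halting policy row lists four inputs: gate status, iteration limits, budgets, and improvement. One way these might combine is sketched below; the `HaltingPolicy` class, the order in which conditions are checked, and the reason strings are all assumptions and not taken from the legacy quality code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HaltingPolicy:
    # Hypothetical thresholds; real defaults would come from the migrated tests.
    max_iterations: int
    budget: float
    min_improvement: float

    def should_halt(self, *, gates_passed: bool, iteration: int,
                    spent: float, improvement: float) -> Tuple[bool, str]:
        """Return (halt, reason); conditions are checked in a fixed order."""
        if gates_passed:
            return True, "gates_passed"
        if iteration >= self.max_iterations:
            return True, "iteration_limit"
        if spent >= self.budget:
            return True, "budget_exhausted"
        if improvement < self.min_improvement:
            return True, "insufficient_improvement"
        return False, "continue"


policy = HaltingPolicy(max_iterations=3, budget=10.0, min_improvement=0.05)
decision = policy.should_halt(gates_passed=False, iteration=1,
                              spent=2.0, improvement=0.2)
```

Returning a reason alongside the boolean gives the run manifest something concrete to record when a recompute stops early.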
## P7 - State Hub And Documentation Follow-Up
| Item | Strategy | Source | Acceptance direction |
| --- | --- | --- | --- |
| Update `KONT-WP-0003` from this backlog | reimplement | this doc | Ensure implementation workplan tasks line up with backlog order. |
| Add architectural ADR for storage/backend decision | defer | `docs/stack-decision.md` | Decide after in-memory contracts stabilize. |
| Add SBOM/lockfile | defer | foundation DoI | Generate once dependencies are installed with a lockfile. |
## Explicitly Out Of Scope
- Porting `markitect/core`, `markitect/schema`, explode/implode, transforms,
includes, document contracts, or selector internals. Use `markitect-tool`.
- Porting provider-specific `markitect/llm` code. Use `llm-connect`.
- Porting finance, issue tracker, profile, release, Gitea, browser, rendering,
or UI plugin code.
- Porting GraphQL as a default API. Use it only as historical evidence.

@@ -4,11 +4,11 @@ type: workplan
title: "markitect-main System-Layer Extraction"
domain: markitect
repo: kontextual-engine
status: active
status: done
owner: codex
topic_slug: markitect
created: "2026-05-03"
updated: "2026-05-03"
updated: "2026-05-05"
state_hub_workstream_id: "e46d0962-7451-4b6c-b39f-461e35ba6a76"
---
@@ -36,7 +36,7 @@ Document the first-pass migration/reimplementation assessment in
```task
id: KONT-WP-0002-T002
status: todo
status: done
priority: high
state_hub_task_id: "86a1bf90-db72-44a0-a5ad-6374e6de8454"
```
@@ -45,11 +45,13 @@ Review legacy filesystem/SQLite repositories, workspace database docs, prompt
run migrations, and related tests. Classify each item as migrate test,
reimplement concept, defer, or out of scope.
Output: `docs/system-layer-extraction-inventory.md`.
## S2.3 - Inventory infospace models and relationships
```task
id: KONT-WP-0002-T003
status: todo
status: done
priority: high
state_hub_task_id: "8b88b3fa-a905-44aa-a25f-993cc9d50f2c"
```
@@ -58,11 +60,13 @@ Review `markitect/infospace/` models, relationship parsing, graph export, and
example fixtures. Extract generic artifact, collection, relationship, and
evaluation concepts without importing project-layer assumptions.
Output: `docs/system-layer-extraction-inventory.md`.
## S2.4 - Inventory orchestration and run-manifest material
```task
id: KONT-WP-0002-T004
status: todo
status: done
priority: high
state_hub_task_id: "1f15f603-4f86-41f8-8a24-95c0e9c825f7"
```
@@ -71,11 +75,14 @@ Review prompt dependency resolution roadmap, run manifests, quality tables,
batch processor behavior, and workflow-related migrations. Produce a candidate
workflow model for engine implementation.
Output: `docs/system-layer-extraction-inventory.md` and
`docs/system-layer-migration-backlog.md`.
## S2.5 - Inventory API and query experiments
```task
id: KONT-WP-0002-T005
status: todo
status: done
priority: medium
state_hub_task_id: "0a1e5a4b-f64d-4228-8f0f-e174475253da"
```
@@ -83,11 +90,14 @@ state_hub_task_id: "0a1e5a4b-f64d-4228-8f0f-e174475253da"
Review query paradigms, GraphQL docs, search/indexing experiments, and error
handling. Decide which API/query ideas deserve new tests or design notes.
Output: `docs/system-layer-extraction-inventory.md` and
`docs/markitect-tool-reuse-boundary.md`.
## S2.6 - Produce migration backlog
```task
id: KONT-WP-0002-T006
status: todo
status: done
priority: high
state_hub_task_id: "54a7e7a7-bf26-4f71-a8a5-9da48f5018c2"
```
@@ -95,3 +105,4 @@ state_hub_task_id: "54a7e7a7-bf26-4f71-a8a5-9da48f5018c2"
Create a structured backlog of candidate tests, fixtures, modules, and
behaviors for `KONT-WP-0003`, grouped by FRS section and migration strategy.
Output: `docs/system-layer-migration-backlog.md`.