Complete system-layer extraction plan

2026-05-05 01:17:48 +02:00
parent 67010a0429
commit 902ba7352d
8 changed files with 345 additions and 21 deletions
--- a/docs/system-layer-extraction-inventory.md
+++ b/docs/system-layer-extraction-inventory.md
@@ -0,0 +1,166 @@
+# System-Layer Extraction Inventory
+
+Date: 2026-05-05
+
+Sources reviewed:
+
+- `/home/worsch/markitect-main/infrastructure/`
+- `/home/worsch/markitect-main/markitect/infospace/`
+- `/home/worsch/markitect-main/markitect/prompts/`
+- `/home/worsch/markitect-main/migrations/prompts/`
+- `/home/worsch/markitect-main/roadmap/prompt-dependency-resolution/`
+- `/home/worsch/markitect-main/markitect/query_paradigms/`
+- `/home/worsch/markitect-main/markitect/plugins/builtin/search/`
+- `/home/worsch/markitect-tool/`
+
+## Executive Classification
+
+| Area | Legacy evidence | Destination |
+| --- | --- | --- |
+| Addressable artifacts and digests | `markitect/prompts/models.py`, `migrations/prompts/001_create_artifacts_table.sql`, prompt tests | Reimplement in `kontextual-engine` as core artifact model. |
+| Artifact storage/repositories | `markitect/prompts/repositories/`, `infrastructure/repositories/sqlite_repository.py` | Reimplement storage interface; use old code as behavior reference only. |
+| Resolution across spaces | `markitect/prompts/resolver/`, `002_create_resolution_config.sql` | Reimplement as collection/context resolution policy. |
+| Runs and manifests | `markitect/prompts/execution/`, `003_create_runs_and_manifests.sql`, `RunManifestSchema.md` | Reimplement as workflow run model. |
+| Dependency graph | `markitect/prompts/dependencies/`, `004_create_dependencies.sql` | Reimplement with artifact relationship graph semantics. |
+| Incremental recompute and impact debt | `markitect/prompts/incremental/`, `005_create_changes_and_debt.sql` | Reimplement after artifact/run persistence exists. |
+| Quality gates and halting | `markitect/prompts/quality/`, `006_create_quality_tables.sql` | Reimplement as provider-neutral validation policy; delegate markdown validation to `markitect-tool`. |
+| Infospace entities and relationships | `markitect/infospace/models.py`, `relation_models.py`, `graph_export.py`, `state.py`, `evaluation.py` | Extract vocabulary and tests; generalize beyond markdown/project directories. |
+| Source pipelines | `markitect/infospace/pipeline.py` | Reimplement as engine workflow concepts; delegate markdown/template operations to adapters. |
+| Query paradigms and search | `markitect/query_paradigms/`, FTS plugin under `markitect/plugins/builtin/search/` | Use as design evidence; reuse `markitect-tool` query/index APIs instead. |
+| Production/config/logging utilities | `markitect/production/`, `infrastructure/logging/` | Mostly out of scope; keep structured error/audit ideas. |
+
+## Persistence And Repository Findings
+
+Relevant legacy concepts:
+
+- Artifact identity: UUID, `space_id`, name, type, content digest, content
+  size, metadata, timestamps.
+- Content addressing: SHA-256 digest used for change detection and
+  idempotency.
+- Repository behavior: create/read/update/delete, duplicate detection,
+  lookup by name or digest, pagination, structured errors.
+- SQLite tables: `prompt_artifacts`, `prompt_resolution_config`,
+  `prompt_runs`, `run_manifests`, `prompt_dependencies`,
+  `artifact_changes`, `impact_debt`, `quality_gates`,
+  `validation_results`.
+
+Extraction decision:
+
+- Reimplement artifact persistence around engine-owned `Artifact`,
+  `Collection`, `ArtifactVersion`, `Relationship`, and `OperationRun`.
+- Do not reuse legacy repository classes directly; they mix issue/project
+  repositories, old workspace assumptions, and Gitea-specific error paths.
+- Preserve tests for digest determinism, artifact reference parsing,
+  duplicate handling, idempotency hashes, and dependency graph operations.
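The digest and duplicate-handling behavior listed above can be illustrated with a minimal sketch. It assumes duplicates are detected per `(space_id, digest)` pair and uses an in-memory store; the names `content_digest`, `Artifact`, and `InMemoryArtifactStore` are hypothetical and are not taken from the legacy code or from `kontextual-engine`.

```python
from __future__ import annotations

import hashlib
import uuid
from dataclasses import dataclass, field


def content_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of the raw content bytes."""
    return hashlib.sha256(content).hexdigest()


@dataclass(frozen=True)
class Artifact:
    # Hypothetical shape: a subset of the identity fields listed in the inventory.
    space_id: str
    name: str
    artifact_type: str
    digest: str
    size: int
    artifact_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class InMemoryArtifactStore:
    """Illustrative store; a real implementation would sit behind a storage interface."""

    def __init__(self) -> None:
        self._by_key: dict[tuple[str, str], Artifact] = {}

    def put(
        self, space_id: str, name: str, artifact_type: str, content: bytes
    ) -> tuple[Artifact, bool]:
        """Store content and return (artifact, created).

        Assumption: identical content in the same space is a duplicate,
        so the existing artifact is returned and created is False.
        """
        digest = content_digest(content)
        key = (space_id, digest)
        existing = self._by_key.get(key)
        if existing is not None:
            return existing, False
        artifact = Artifact(space_id, name, artifact_type, digest, len(content))
        self._by_key[key] = artifact
        return artifact, True

    def find_by_digest(self, space_id: str, digest: str) -> Artifact | None:
        """Look up an artifact by content digest within one space."""
        return self._by_key.get((space_id, digest))
```

Because the digest depends only on the content bytes, storing the same bytes twice in one space yields the same artifact, which is the property the digest-determinism and duplicate-handling tests would need to pin down.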
+
+## Infospace Model And Relationship Findings
+
+Relevant legacy concepts:
+
+- `EntityMeta`: slug, title, typed section contents, derived metrics, source
+  path, section slugs.
+- `RelationMeta`: subject, predicate, object, relation type, evidence, source
+  path, edge tuple.
+- `InfospaceState`: aggregate state with entity count, domains, latest
+  evaluation snapshot, viability checks.
+- `EvaluationSnapshot`: per-entity scores, collection metrics, score/metric
+  diffs.
+- `EntityGraph`: nodes, multiedges, feedback loop membership, filters, export.
+- Discipline composition: reusable conceptual collection, path resolution,
+  viability, stale mapping detection.
+
+Extraction decision:
+
+- Convert `EntityMeta` into a generic artifact facet, not an engine-only entity
+  type.
+- Convert `RelationMeta` into first-class relationship records with typed
+  predicates, provenance, and evidence.
+- Keep evaluation snapshots as generic assessment snapshots attached to
+  artifacts, collections, or workflow runs.
+- Keep discipline composition as collection dependency and mapping freshness,
+  without assuming markdown directories.
+- Use `markitect-tool` for parsing markdown entities and extracting sections.
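As one possible reading of the `RelationMeta` decision above, a first-class relationship record could carry its predicate, evidence, and provenance alongside the edge tuple. The sketch below uses standard-library dataclasses; `Relationship`, `Provenance`, and `outgoing` are hypothetical names, and the field set is an assumption rather than the engine's actual model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Provenance:
    # Hypothetical: where the relationship was observed and what extracted it.
    source_ref: str
    extracted_by: str


@dataclass(frozen=True)
class Relationship:
    subject_id: str
    predicate: str
    object_id: str
    evidence: tuple[str, ...] = ()
    provenance: Provenance | None = None

    def edge(self) -> tuple[str, str, str]:
        """Return the (subject, predicate, object) tuple used as the graph edge."""
        return (self.subject_id, self.predicate, self.object_id)


def outgoing(
    relationships: Iterable[Relationship],
    subject_id: str,
    predicate: str | None = None,
) -> list[Relationship]:
    """Filter relationships by subject and, optionally, by predicate."""
    return [
        rel
        for rel in relationships
        if rel.subject_id == subject_id
        and (predicate is None or rel.predicate == predicate)
    ]
```

Keeping the source reference as an opaque string rather than a filesystem path is one way to avoid assuming markdown directories, in line with the decision above.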
+
+## Orchestration And Run Manifest Findings
+
+Relevant legacy concepts:
+
+- Prompt run stages: template analysis, context compilation, prompt processing.
+- Input bundle hash: template digest, ordered dependency digests, resolver
+  config, model settings, compilation options.
+- Nested generator runs with `parent_run_id` and depth limits.
+- Run manifest: resolved inputs, compiled prompt digest, model config,
+  outputs, dependency edges, validation results, impact debt, timing metadata.
+- Recompute defaults: depth 1, circular suppression, budget limits, impact
+  thresholds.
+- Traceability: produced artifacts link back to templates, inputs, runs, and
+  dependency edges.
+
+Extraction decision:
+
+- Generalize `PromptRun` into `OperationRun` / `WorkflowRun`, where prompt
+  execution is one operation kind.
+- Keep the input bundle hash concept as a universal idempotency key for
+  deterministic and assisted operations.
+- Keep run manifests as durable, inspectable records. Prefer a Pydantic model
+  before any database schema.
+- Do not embed LLM provider execution; route provider calls through
+  `llm-connect` adapters.
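The input bundle hash described above could be computed along the following lines. This is a sketch under stated assumptions: the bundle is serialized as canonical JSON (sorted keys, fixed separators) and hashed with SHA-256, dependency digests keep their given order, and `input_bundle_hash` and `run_once` are hypothetical helper names, not legacy or engine APIs.

```python
import hashlib
import json
from typing import Any, Callable, Mapping, Sequence


def input_bundle_hash(
    template_digest: str,
    dependency_digests: Sequence[str],
    resolver_config: Mapping[str, Any],
    model_settings: Mapping[str, Any],
    compilation_options: Mapping[str, Any],
) -> str:
    """Hash all inputs that determine an operation's result."""
    bundle = {
        "template_digest": template_digest,
        # Order is significant: a list is kept as-is, not sorted.
        "dependency_digests": list(dependency_digests),
        "resolver_config": dict(resolver_config),
        "model_settings": dict(model_settings),
        "compilation_options": dict(compilation_options),
    }
    # sort_keys makes the encoding independent of dict insertion order.
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_once(cache: dict, key: str, operation: Callable[[], Any]) -> Any:
    """Run operation only if no result is cached under the idempotency key."""
    if key not in cache:
        cache[key] = operation()
    return cache[key]
```

With this shape, two runs whose configuration differs only in key order share a hash and therefore a cached result, while reordering the dependency digests produces a different key.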
+
+## API, Query, And Retrieval Findings
+
+Relevant legacy concepts:
+
+- `QueryResult` standard envelope: paradigm, query, timing, count, results,
+  metadata, success/error.
+- Query registry: discoverable paradigms by name/category/complexity.
+- SQLite FTS indexer: FTS5 tables, rebuild, optimize, availability checks.
+- GraphQL/REST paradigm files exist, but they are better read as evidence of
+  desired integrations than as a target API contract.
+
+Extraction decision:
+
+- Engine queries should return stable result envelopes with metadata,
+  provenance, and structured errors.
+- Queries should operate over artifact ids, metadata, relationships, content
+  references, run records, and assessment snapshots.
+- Reuse `markitect-tool` selector/query and local index APIs for markdown
+  content. Do not port the old query paradigm registry wholesale.
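A stable result envelope with structured errors might look like the sketch below. The field names loosely follow the legacy `QueryResult` list (query, timing, count, results, metadata, success/error), but `ResultEnvelope`, `QueryError`, and `run_query` are hypothetical, and mapping exceptions to an error code by class name is an assumption made only for illustration.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class QueryError:
    code: str
    message: str


@dataclass(frozen=True)
class ResultEnvelope:
    query: str
    results: tuple[Any, ...] = ()
    elapsed_ms: float = 0.0
    metadata: dict[str, Any] = field(default_factory=dict)
    error: QueryError | None = None

    @property
    def count(self) -> int:
        return len(self.results)

    @property
    def success(self) -> bool:
        return self.error is None


def run_query(query: str, execute: Callable[[str], Iterable[Any]]) -> ResultEnvelope:
    """Run execute(query) and always return an envelope, even on failure."""
    started = time.perf_counter()
    results: tuple[Any, ...] = ()
    error: QueryError | None = None
    try:
        results = tuple(execute(query))
    except Exception as exc:  # Deliberately broad: failures become structured errors.
        error = QueryError(code=type(exc).__name__, message=str(exc))
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    return ResultEnvelope(query=query, results=results, elapsed_ms=elapsed_ms, error=error)
```

Callers then inspect `success`, `count`, and `error` on the same object regardless of outcome, which keeps the contract stable across backends.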
+
+## Candidate Test Sources
+
+High-value legacy tests to mine:
+
+- `tests/unit/prompts/test_artifact_models.py`
+- `tests/unit/prompts/test_artifact_repository.py`
+- `tests/unit/prompts/test_template_models.py`
+- `tests/unit/prompts/test_macro_parser.py`
+- `tests/unit/prompts/test_resolution_strategy.py`
+- `tests/unit/prompts/test_context_compiler.py`
+- `tests/unit/prompts/test_execution_models.py`
+- `tests/unit/prompts/test_execution_engine.py`
+- `tests/unit/prompts/test_dependency_models.py`
+- `tests/unit/prompts/test_dependency_repository.py`
+- `tests/unit/prompts/test_incremental_engine.py`
+- `tests/unit/prompts/test_change_detector.py`
+- `tests/unit/prompts/test_impact_analyzer.py`
+- `tests/unit/prompts/test_quality_gates.py`
+- `tests/unit/prompts/test_halting_policy.py`
+- `tests/unit/prompts/test_traceability_service.py`
+- `tests/integration/prompts/test_dependency_graph.py`
+- `tests/integration/prompts/test_incremental_recompute.py`
+- `tests/integration/prompts/test_quality_validation.py`
+- `tests/integration/prompts/test_traceability_workflow.py`
+- `tests/unit/infospace/test_entity_parser.py`
+- `tests/unit/infospace/test_evaluation.py`
+- `tests/unit/infospace/test_composition.py`
+- `tests/unit/infospace/test_checks.py`
+- `markitect/query_paradigms/tests/test_query_paradigms.py`
+
+Test migration posture:
+
+- Migrate expected behavior, not imports.
+- Rewrite around `kontextual_engine.*` contracts.
+- Keep markdown parsing fixtures behind `markitect-tool` adapters.
+- Treat prompt tests as the first source for `KONT-WP-0003` unit tests.
+