# System-Layer Extraction Inventory

Date: 2026-05-05

Sources reviewed:
- `/home/worsch/markitect-main/infrastructure/`
- `/home/worsch/markitect-main/markitect/infospace/`
- `/home/worsch/markitect-main/markitect/prompts/`
- `/home/worsch/markitect-main/migrations/prompts/`
- `/home/worsch/markitect-main/roadmap/prompt-dependency-resolution/`
- `/home/worsch/markitect-main/markitect/query_paradigms/`
- `/home/worsch/markitect-main/markitect/plugins/builtin/search/`
- `/home/worsch/markitect-tool/`

## Executive Classification
| Area | Legacy evidence | Destination |
| --- | --- | --- |
| Addressable artifacts and digests | `markitect/prompts/models.py`, `migrations/prompts/001_create_artifacts_table.sql`, prompt tests | Reimplement in `kontextual-engine` as core artifact model. |
| Artifact storage/repositories | `markitect/prompts/repositories/`, `infrastructure/repositories/sqlite_repository.py` | Reimplement storage interface; use old code as behavior reference only. |
| Resolution across spaces | `markitect/prompts/resolver/`, `002_create_resolution_config.sql` | Reimplement as collection/context resolution policy. |
| Runs and manifests | `markitect/prompts/execution/`, `003_create_runs_and_manifests.sql`, `RunManifestSchema.md` | Reimplement as workflow run model. |
| Dependency graph | `markitect/prompts/dependencies/`, `004_create_dependencies.sql` | Reimplement with artifact relationship graph semantics. |
| Incremental recompute and impact debt | `markitect/prompts/incremental/`, `005_create_changes_and_debt.sql` | Reimplement after artifact/run persistence exists. |
| Quality gates and halting | `markitect/prompts/quality/`, `006_create_quality_tables.sql` | Reimplement as provider-neutral validation policy; delegate markdown validation to `markitect-tool`. |
| Infospace entities and relationships | `markitect/infospace/models.py`, `relation_models.py`, `graph_export.py`, `state.py`, `evaluation.py` | Extract vocabulary and tests; generalize beyond markdown/project directories. |
| Source pipelines | `markitect/infospace/pipeline.py` | Reimplement as engine workflow concepts; delegate markdown/template operations to adapters. |
| Query paradigms and search | `markitect/query_paradigms/`, FTS plugin in `markitect/plugins/builtin/search/` | Use as design evidence only; reuse `markitect-tool` query/index APIs instead of porting. |
| Production/config/logging utilities | `markitect/production/`, `infrastructure/logging/` | Mostly out of scope; keep structured error/audit ideas. |
## Persistence And Repository Findings

Relevant legacy concepts:
- Artifact identity: UUID, `space_id`, name, type, content digest, content
size, metadata, timestamps.
- Content addressing: SHA-256 digest used for change detection and
idempotency.
- Repository behavior: create/read/update/delete, duplicate detection,
lookup by name or digest, pagination, structured errors.
- SQLite tables: `prompt_artifacts`, `prompt_resolution_config`,
`prompt_runs`, `run_manifests`, `prompt_dependencies`,
`artifact_changes`, `impact_debt`, `quality_gates`,
`validation_results`.

Extraction decision:
- Reimplement artifact persistence around engine-owned `Artifact`,
`Collection`, `ArtifactVersion`, `Relationship`, and `OperationRun`.
- Do not reuse legacy repository classes directly; they mix issue/project
repositories, old workspace assumptions, and Gitea-specific error paths.
- Preserve tests for digest determinism, artifact reference parsing,
duplicate handling, idempotency hashes, and dependency graph operations.
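
The content-addressing and duplicate-handling behavior listed above can be sketched as follows. This is a minimal illustration, not the engine's actual model: the `Artifact` fields mirror the legacy identity list, while `content_digest`, `DuplicateArtifactError`, and `InMemoryArtifactRepository` are hypothetical names, and treating "same digest in the same space" as a duplicate is an assumption about the intended rule.

```python
from __future__ import annotations

import hashlib
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


def content_digest(content: bytes) -> str:
    """Return the hex SHA-256 digest used for change detection."""
    return hashlib.sha256(content).hexdigest()


@dataclass(frozen=True)
class Artifact:
    # Hypothetical engine-owned model; fields follow the legacy identity list.
    space_id: str
    name: str
    type: str
    content: bytes
    metadata: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def digest(self) -> str:
        # Derived from content, so equal content always yields an equal digest.
        return content_digest(self.content)

    @property
    def size(self) -> int:
        return len(self.content)


class DuplicateArtifactError(Exception):
    """Raised when a space already holds an artifact with the same digest."""


class InMemoryArtifactRepository:
    """Stand-in for a storage interface; assumes per-space digest uniqueness."""

    def __init__(self) -> None:
        self._by_id: dict[str, Artifact] = {}
        self._by_digest: dict[tuple[str, str], str] = {}

    def create(self, artifact: Artifact) -> Artifact:
        key = (artifact.space_id, artifact.digest)
        if key in self._by_digest:
            raise DuplicateArtifactError(
                f"digest {artifact.digest[:12]} already stored in {artifact.space_id}"
            )
        self._by_id[artifact.id] = artifact
        self._by_digest[key] = artifact.id
        return artifact

    def get_by_digest(self, space_id: str, digest: str) -> Artifact | None:
        artifact_id = self._by_digest.get((space_id, digest))
        return self._by_id.get(artifact_id) if artifact_id else None
```

Keeping the digest a derived property rather than a stored field is one way to make the digest-determinism tests trivially portable; a persistent backend would likely store it in an indexed column instead.
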
## Infospace Model And Relationship Findings

Relevant legacy concepts:
- `EntityMeta`: slug, title, typed section contents, derived metrics, source
path, section slugs.
- `RelationMeta`: subject, predicate, object, relation type, evidence, source
path, edge tuple.
- `InfospaceState`: aggregate state with entity count, domains, latest
evaluation snapshot, viability checks.
- `EvaluationSnapshot`: per-entity scores, collection metrics, score/metric
diffs.
- `EntityGraph`: nodes, multiedges, feedback loop membership, filters, export.
- Discipline composition: reusable conceptual collection, path resolution,
viability, stale mapping detection.

Extraction decision:
- Convert `EntityMeta` into a generic artifact facet, not an engine-only entity
type.
- Convert `RelationMeta` into first-class relationship records with typed
predicates, provenance, and evidence.
- Keep evaluation snapshots as generic assessment snapshots attached to
artifacts, collections, or workflow runs.
- Keep discipline composition as collection dependency and mapping freshness,
without assuming markdown directories.
- Use `markitect-tool` for parsing markdown entities and extracting sections.
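
To make the "first-class relationship record" decision concrete, here is a small sketch of what such a record might look like. The subject/predicate/object/relation-type/evidence shape comes from the legacy `RelationMeta` description; the class names `Relationship` and `Evidence`, the `provenance` field as a plain run identifier, and the `outgoing` helper are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Evidence:
    # Hypothetical evidence record: where the claim was found and a short excerpt.
    source_path: str
    excerpt: str


@dataclass(frozen=True)
class Relationship:
    # Hypothetical generalization of the legacy RelationMeta.
    subject_id: str
    predicate: str
    object_id: str
    relation_type: str
    evidence: tuple[Evidence, ...] = ()
    provenance: str | None = None  # assumed: id of the run that produced the record

    @property
    def edge(self) -> tuple[str, str, str]:
        """The (subject, predicate, object) tuple used as a graph multiedge key."""
        return (self.subject_id, self.predicate, self.object_id)


def outgoing(
    relationships: Iterable[Relationship],
    subject_id: str,
    predicate: str | None = None,
) -> list[Relationship]:
    """Return relationships leaving `subject_id`, optionally filtered by predicate."""
    return [
        r
        for r in relationships
        if r.subject_id == subject_id
        and (predicate is None or r.predicate == predicate)
    ]
```

Because the record refers to artifacts only by id, nothing in it assumes markdown files or project directories; a markdown adapter would fill `Evidence.source_path` from whatever `markitect-tool` reports.
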
## Orchestration And Run Manifest Findings

Relevant legacy concepts:
- Prompt run stages: template analysis, context compilation, prompt processing.
- Input bundle hash: template digest, ordered dependency digests, resolver
config, model settings, compilation options.
- Nested generator runs with `parent_run_id` and depth limits.
- Run manifest: resolved inputs, compiled prompt digest, model config,
outputs, dependency edges, validation results, impact debt, timing metadata.
- Recompute defaults: depth 1, circular suppression, budget limits, impact
thresholds.
- Traceability: produced artifacts link back to templates, inputs, runs, and
dependency edges.

Extraction decision:
- Generalize `PromptRun` into `OperationRun` / `WorkflowRun`, where prompt
execution is one operation kind.
- Keep the input bundle hash concept as a universal idempotency key for
deterministic and assisted operations.
- Keep run manifests as durable, inspectable records. Define them as a
Pydantic model first, before committing to any database schema.
- Do not embed LLM provider execution; route provider calls through
`llm-connect` adapters.
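
The input bundle hash described above can be sketched as a digest over a canonical serialization of its components. The component list (template digest, ordered dependency digests, resolver config, model settings, compilation options) comes from the legacy description; the function name `input_bundle_hash` and the choice of canonical JSON plus SHA-256 are assumptions, not the legacy implementation.

```python
import hashlib
import json
from typing import Any, Mapping, Sequence


def input_bundle_hash(
    template_digest: str,
    dependency_digests: Sequence[str],
    resolver_config: Mapping[str, Any],
    model_settings: Mapping[str, Any],
    compilation_options: Mapping[str, Any],
) -> str:
    """Compute a deterministic idempotency key for one operation run.

    Mapping key order does not affect the result (keys are sorted), but the
    order of dependency digests does, because it is treated as significant.
    """
    bundle = {
        "template_digest": template_digest,
        "dependency_digests": list(dependency_digests),
        "resolver_config": dict(resolver_config),
        "model_settings": dict(model_settings),
        "compilation_options": dict(compilation_options),
    }
    # Canonical form: sorted keys, no insignificant whitespace, ASCII only.
    canonical = json.dumps(
        bundle, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two runs with the same key can then be treated as the same unit of work, for example by looking up an existing manifest before executing. This sketch assumes every config value is JSON-serializable; values such as paths or datetimes would need an explicit normalization step first.
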
## API, Query, And Retrieval Findings

Relevant legacy concepts:
- `QueryResult` standard envelope: paradigm, query, timing, count, results,
metadata, success/error.
- Query registry: discoverable paradigms by name/category/complexity.
- SQLite FTS indexer: FTS5 tables, rebuild, optimize, availability checks.
- GraphQL/REST paradigm files exist, but they are better read as evidence of
desired integrations than as a target API contract.

Extraction decision:
- Engine queries should return stable result envelopes with metadata,
provenance, and structured errors.
- Queries should operate over artifact ids, metadata, relationships, content
references, run records, and assessment snapshots.
- Reuse `markitect-tool` selector/query and local index APIs for markdown
content. Do not port the old query paradigm registry wholesale.
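
A stable result envelope along these lines might look like the sketch below. The fields echo the legacy `QueryResult` description (query, timing, count, results, metadata, success/error) plus the provenance called for above; `QueryError`, `run_query`, and the exact field names are hypothetical and only illustrate the shape.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class QueryError:
    # Structured error: a machine-readable code plus a human-readable message.
    code: str
    message: str


@dataclass(frozen=True)
class QueryResult:
    # Hypothetical envelope; callers always receive this shape, even on failure.
    query: str
    results: list[dict[str, Any]] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    provenance: dict[str, Any] = field(default_factory=dict)
    elapsed_ms: float = 0.0
    error: QueryError | None = None

    @property
    def success(self) -> bool:
        return self.error is None

    @property
    def count(self) -> int:
        return len(self.results)


def run_query(
    query: str,
    handler: Callable[[str], Iterable[dict[str, Any]]],
    source: str = "unknown",
) -> QueryResult:
    """Run `handler` and wrap its rows or its failure in one envelope."""
    start = time.perf_counter()
    rows: list[dict[str, Any]] = []
    error: QueryError | None = None
    try:
        rows = list(handler(query))
    except Exception as exc:  # converted into a structured error, not re-raised
        error = QueryError(code=type(exc).__name__, message=str(exc))
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return QueryResult(
        query=query,
        results=rows,
        provenance={"source": source},
        elapsed_ms=elapsed_ms,
        error=error,
    )
```

In this shape, a markdown-backed handler would simply be a thin adapter over the `markitect-tool` selector/query APIs, so the engine never needs its own paradigm registry to produce a consistent envelope.
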
## Candidate Test Sources

High-value legacy tests to mine:
- `tests/unit/prompts/test_artifact_models.py`
- `tests/unit/prompts/test_artifact_repository.py`
- `tests/unit/prompts/test_template_models.py`
- `tests/unit/prompts/test_macro_parser.py`
- `tests/unit/prompts/test_resolution_strategy.py`
- `tests/unit/prompts/test_context_compiler.py`
- `tests/unit/prompts/test_execution_models.py`
- `tests/unit/prompts/test_execution_engine.py`
- `tests/unit/prompts/test_dependency_models.py`
- `tests/unit/prompts/test_dependency_repository.py`
- `tests/unit/prompts/test_incremental_engine.py`
- `tests/unit/prompts/test_change_detector.py`
- `tests/unit/prompts/test_impact_analyzer.py`
- `tests/unit/prompts/test_quality_gates.py`
- `tests/unit/prompts/test_halting_policy.py`
- `tests/unit/prompts/test_traceability_service.py`
- `tests/integration/prompts/test_dependency_graph.py`
- `tests/integration/prompts/test_incremental_recompute.py`
- `tests/integration/prompts/test_quality_validation.py`
- `tests/integration/prompts/test_traceability_workflow.py`
- `tests/unit/infospace/test_entity_parser.py`
- `tests/unit/infospace/test_evaluation.py`
- `tests/unit/infospace/test_composition.py`
- `tests/unit/infospace/test_checks.py`
- `markitect/query_paradigms/tests/test_query_paradigms.py`

Test migration posture:
- Migrate expected behavior, not imports.
- Rewrite around `kontextual_engine.*` contracts.
- Keep markdown parsing fixtures behind `markitect-tool` adapters.
- Treat prompt tests as the first source for `KONT-WP-0003` unit tests.