first ingestion/normalization slice

2026-05-06 02:35:40 +02:00
parent 286ebc3cb6
commit 565a5643a3
19 changed files with 1231 additions and 10 deletions


@@ -4,13 +4,13 @@ type: workplan
title: "Multi-Format Ingestion And Normalization"
domain: markitect
repo: kontextual-engine
-status: todo
+status: active
owner: codex
topic_slug: markitect
planning_priority: high
planning_order: 6
created: "2026-05-05"
-updated: "2026-05-05"
+updated: "2026-05-06"
state_hub_workstream_id: "270c83c0-eaed-4143-99d0-bb3fcfd23758"
---
@@ -45,11 +45,21 @@ needed, and snapshot identity. The engine should normalize Markitect results
into its common representation and preserve source/adapter provenance rather
than rebuilding Markdown syntax behavior.
+## Implementation Status
+As of 2026-05-06, the first ingestion slice is recorded in
+`docs/ingestion-implementation.md`. It establishes ingestion job primitives,
+connector/extractor ports, local file ingestion, plain text normalization,
+Markitect markdown adapter boundaries, directory partial-result reporting, and
+in-memory/SQLite job persistence. Remaining work is focused on async execution,
+re-ingestion identity reconciliation, richer structural extraction, quarantine
+policy checks, and non-text format adapters.
## I6.1 - Implement ingestion job model status and retry surface
```task
id: KONT-WP-0006-T001
-status: todo
+status: done
priority: high
state_hub_task_id: "8e5e514a-6eef-42d9-a93c-2458b4c82753"
```
@@ -68,7 +78,7 @@ Acceptance:
```task
id: KONT-WP-0006-T002
-status: todo
+status: done
priority: high
state_hub_task_id: "3eafdab5-478d-49d9-a17f-3cd7c8847cb1"
```
@@ -87,7 +97,7 @@ Acceptance:
```task
id: KONT-WP-0006-T003
-status: todo
+status: in_progress
priority: high
state_hub_task_id: "d3e3d4d2-a581-4438-bee7-6fc4161d3925"
```
@@ -105,7 +115,7 @@ Acceptance:
```task
id: KONT-WP-0006-T004
-status: todo
+status: in_progress
priority: high
state_hub_task_id: "63bf2f7e-705d-40ae-a160-75fc508ffb1f"
```