From d3a77a6eef3d50505818574adadfc3d92806cbf9 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Mon, 4 May 2026 10:49:07 +0200
Subject: [PATCH] Workplan extensible canonical processing model

---
 docs/workplan-planning-map.md                 |  12 +
 ...TT-WP-0013-internal-extension-framework.md | 288 ++++++++++++++++++
 2 files changed, 300 insertions(+)
 create mode 100644 workplans/MKTT-WP-0013-internal-extension-framework.md

diff --git a/docs/workplan-planning-map.md b/docs/workplan-planning-map.md
index b6d25f5..33c5e6d 100644
--- a/docs/workplan-planning-map.md
+++ b/docs/workplan-planning-map.md
@@ -34,6 +34,7 @@ and descriptions mirror the operational view.
 | `MKTT-WP-0006` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T005` | Optional backend fabric is complete: manifests, capabilities, snapshot identity, interfaces, registry, provenance, and read-only CLI scaffolding. |
 | `MKTT-WP-0010` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T006` | Content references, processors, explode/implode, weave/tangle, content classes, and migration examples are complete as the first WP-0010 extension layer. |
 | `MKTT-WP-0007` | complete | done | `MKTT-WP-0006` | Advanced query and local index backend is complete: AST inspection, optional JSONPath, SQLite snapshots/metadata, FTS5 search, incremental refresh, and local index CLI. |
+| `MKTT-WP-0013` | P1 | todo | `MKTT-WP-0003`, `MKTT-WP-0004`, `MKTT-WP-0006`, `MKTT-WP-0007`, `MKTT-WP-0010` | Internal extension framework and canonical processing model: characterize current behavior, add registries/descriptors/callbacks, and reduce central wiring before heavier runtime/workflow work. |
 | `MKTT-WP-0005` | P2 | todo | `MKTT-WP-0003`, `MKTT-WP-0004` | Pick up when generation/form/context or semantic assessment pressure appears. |
 | `MKTT-WP-0011` | P2 | todo | `MKTT-WP-0003`; task-level triggers: `MKTT-WP-0010-T001`, `MKTT-WP-0010-T005` | Declarative Markdown dataflow workflows: source extraction, deterministic/assisted processing, and multi-output generation. |
 | `MKTT-WP-0009` | P2 | todo | `MKTT-WP-0006` | Establish access-control gateway before security-sensitive cache/context use. |
@@ -61,6 +62,12 @@ optional assisted generation hooks, and multiple Markdown outputs. It should
 not block P3.7, but it should follow the first reference model and processor
 registry decisions in `MKTT-WP-0010`.
 
+`MKTT-WP-0013` captures internal extensibility pressure found while adding
+optional query, backend, processor, and index features. It should precede major
+runtime/workflow expansion because it reduces central wiring and gives future
+features a canonical processing context/result/diagnostic/provenance model. It
+is not a business dataflow layer; that remains `MKTT-WP-0011`.
+
+`MKTT-WP-0012` captures the Quarkdown-inspired document function layer. It
 should follow `MKTT-WP-0011` because the workflow layer will reveal which
 operations deserve author-facing function syntax. It should remain optional and
@@ -85,6 +92,11 @@ dependencies:
 - `MKTT-WP-0005 -> MKTT-WP-0004`
 - `MKTT-WP-0011 -> MKTT-WP-0003`
 - `MKTT-WP-0009 -> MKTT-WP-0006`
+- `MKTT-WP-0013 -> MKTT-WP-0003`
+- `MKTT-WP-0013 -> MKTT-WP-0004`
+- `MKTT-WP-0013 -> MKTT-WP-0006`
+- `MKTT-WP-0013 -> MKTT-WP-0007`
+- `MKTT-WP-0013 -> MKTT-WP-0010`
 - `MKTT-WP-0012 -> MKTT-WP-0004`
 - `MKTT-WP-0012 -> MKTT-WP-0010`
 - `MKTT-WP-0012 -> MKTT-WP-0011`
diff --git a/workplans/MKTT-WP-0013-internal-extension-framework.md b/workplans/MKTT-WP-0013-internal-extension-framework.md
new file mode 100644
index 0000000..a115483
--- /dev/null
+++ b/workplans/MKTT-WP-0013-internal-extension-framework.md
@@ -0,0 +1,288 @@
+---
+id: MKTT-WP-0013
+type: workplan
+title: "Internal Extension Framework and Canonical Processing Model"
+domain: markitect
+status: todo
+owner: markitect-tool
+topic_slug: markitect
+planning_priority: P1
+planning_order: 65
+depends_on_workplans:
+  - MKTT-WP-0003
+  - MKTT-WP-0004
+  - MKTT-WP-0006
+  - MKTT-WP-0007
+  - MKTT-WP-0010
+related_workplans:
+  - MKTT-WP-0005
+  - MKTT-WP-0009
+  - MKTT-WP-0011
+  - MKTT-WP-0012
+created: "2026-05-04"
+updated: "2026-05-04"
+state_hub_workstream_id: "5eea103f-f584-4360-b7e3-c5b09a4814bd"
+---
+
+# MKTT-WP-0013: Internal Extension Framework and Canonical Processing Model
+
+## Purpose
+
+Create an internal extension framework that lets optional Markitect features
+register well-contained implementations, descriptors, callbacks, diagnostics,
+capabilities, and CLI/query integration points without repeatedly expanding
+central modules.
+
+This workplan is about internal extensibility and framework shape. It is
+distinct from `MKTT-WP-0011`, which organizes business-facing dataflow pipelines.
+
+## Background
+
+Recent implementation work added valuable optional functionality:
+
+- processor registry and deterministic fenced-block processors
+- backend manifests and local SQLite backend
+- selector and optional JSONPath query engines
+- FTS search over indexed sections and blocks
+- content references, literate workflows, explode/implode, and content classes
+
+The functionality is working, but extension pressure is visible. Optional
+features still tend to require edits in central files such as CLI wiring, query
+exports, backend exports, and shared command dispatch. That is acceptable early
+in a small toolkit, but it becomes a maintenance liability if Markitect is meant
+to grow into a research lab for sophisticated Markdown/knowledge systems.
+
+The target architecture should preserve the current slim core while making
+extensions feel first-class:
+
+```text
+specification file + implementation module + registration descriptor
+  -> extension registry
+  -> canonical processing request/context/result
+  -> callbacks, diagnostics, provenance, capabilities
+  -> CLI/API/query/backend integration
+```
+
+## Decision
+
+Yes, restructure, but do it deliberately:
+
+1. Add characterization tests for the current behaviors before refactoring.
+2. Define a canonical processing model that extensions can share.
+3. Introduce extension descriptors and registries with minimal central wiring.
+4. Migrate one vertical slice at a time.
+5. Keep compatibility aliases and existing CLI commands stable.
+
+Avoid a plugin system that is more elaborate than the project needs. The first
+version should support internal extension isolation and later package-level
+discovery without forcing dynamic loading or external dependency installation.
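
To make the "minimal central wiring, no dynamic loading" decision concrete, here is a minimal sketch of what an explicit in-process registry could look like. It is not taken from the Markitect codebase: `ExtensionDescriptor`, `ExtensionRegistry`, their fields, and the extension ids are all hypothetical names chosen for illustration, and the real descriptor schema is left to P13.4.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ExtensionDescriptor:
    """Hypothetical descriptor shape; every field name here is an assumption."""
    id: str
    kind: str                        # e.g. "query-engine", "processor", "backend"
    factory: Callable[[], object]    # implementation reference, called lazily
    requires: Tuple[str, ...] = ()   # ids of extensions this one depends on


@dataclass
class ExtensionRegistry:
    """In-process registry filled by explicit register() calls, no dynamic loading."""
    _items: Dict[str, ExtensionDescriptor] = field(default_factory=dict)

    def register(self, descriptor: ExtensionDescriptor) -> None:
        # Duplicate ids are rejected loudly instead of silently overriding.
        if descriptor.id in self._items:
            raise ValueError(f"duplicate extension id: {descriptor.id}")
        self._items[descriptor.id] = descriptor

    def by_kind(self, kind: str) -> List[ExtensionDescriptor]:
        return [d for d in self._items.values() if d.kind == kind]

    def missing_dependencies(self) -> List[str]:
        """Return 'extension -> dependency' strings for unregistered dependencies."""
        return [
            f"{d.id} -> {dep}"
            for d in self._items.values()
            for dep in d.requires
            if dep not in self._items
        ]


# Illustrative ids only; `dict` stands in for a real implementation factory.
registry = ExtensionRegistry()
registry.register(ExtensionDescriptor("query.selector", "query-engine", dict))
registry.register(
    ExtensionDescriptor("search.fts", "backend", dict, requires=("backend.sqlite",))
)
print([d.id for d in registry.by_kind("query-engine")])
print(registry.missing_dependencies())
```

Because registration is an ordinary function call, each extension module could expose its own descriptors and a single assembly point could collect them. Package entry-point discovery could later feed the same `register()` call without changing the registry itself.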
+
+## P13.1 - Architecture note and extension taxonomy
+
+```task
+id: MKTT-WP-0013-T001
+status: todo
+priority: high
+state_hub_task_id: "ba106001-c953-435a-8012-0dd83533d309"
+```
+
+Define the internal extension taxonomy:
+
+- query engines
+- processors
+- backends and index stores
+- references and content-unit providers
+- validators and contract checks
+- templates/generation adapters
+- CLI command groups
+- future render/export adapters
+- future document functions
+
+Output: architecture note explaining extension boundaries, lifecycle,
+registration semantics, and relationship to `MKTT-WP-0011`.
+
+## P13.2 - Add characterization tests before refactor
+
+```task
+id: MKTT-WP-0013-T002
+status: todo
+priority: high
+state_hub_task_id: "a270cb7a-4dbf-4562-b0ab-d5dda5124086"
+```
+
+Lock down current behavior before moving code behind registries:
+
+- selector query and extraction
+- optional JSONPath diagnostics
+- processor registry behavior
+- backend manifest registry
+- local SQLite snapshot/index/search behavior
+- content reference resolution
+- key CLI commands and output envelopes
+- provenance and diagnostics shapes
+
+Output: focused characterization tests that can fail loudly if refactoring
+changes public behavior.
+
+## P13.3 - Define canonical processing model
+
+```task
+id: MKTT-WP-0013-T003
+status: todo
+priority: high
+state_hub_task_id: "8c88b9a7-1e8d-401c-ad09-8b5a19ccba14"
+```
+
+Create shared framework types for extension execution:
+
+- `ProcessingRequest`
+- `ProcessingContext`
+- `ProcessingResult`
+- `ProcessingDiagnostic`
+- `ProcessingCapability`
+- `ProcessingProvenance`
+- optional `ProcessingTrace`
+
+The model should support deterministic, assisted, external, and read-only
+operations without making every extension depend on every subsystem.
+
+Output: framework module, tests, and migration guide for current subsystems.
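
The type names listed in P13.3 could be sketched as plain dataclasses. The sketch below assumes a shape for illustration only: the type names come from the workplan, but every field, the `Severity` enum, and the toy `run_uppercase` extension are hypothetical and not part of the patch. `ProcessingCapability` and `ProcessingTrace` are omitted to keep it short.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List, Optional, Tuple


class Severity(Enum):
    # Hypothetical severity levels; the workplan does not define them.
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


@dataclass(frozen=True)
class ProcessingDiagnostic:
    code: str
    message: str
    severity: Severity = Severity.ERROR


@dataclass(frozen=True)
class ProcessingProvenance:
    extension_id: str
    inputs: Tuple[str, ...] = ()


@dataclass(frozen=True)
class ProcessingRequest:
    operation: str
    payload: Any
    options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ProcessingContext:
    # Capabilities are plain strings here; a real model might use richer objects.
    capabilities: frozenset = frozenset()
    read_only: bool = True


@dataclass
class ProcessingResult:
    value: Any = None
    diagnostics: List[ProcessingDiagnostic] = field(default_factory=list)
    provenance: Optional[ProcessingProvenance] = None

    @property
    def ok(self) -> bool:
        # A result is ok when it carries no error-level diagnostics.
        return not any(d.severity is Severity.ERROR for d in self.diagnostics)


def run_uppercase(request: ProcessingRequest, context: ProcessingContext) -> ProcessingResult:
    """Toy deterministic extension, used only to exercise the types above."""
    if not isinstance(request.payload, str):
        return ProcessingResult(
            diagnostics=[ProcessingDiagnostic("E_TYPE", "payload must be a string")]
        )
    return ProcessingResult(
        value=request.payload.upper(),
        provenance=ProcessingProvenance("demo.uppercase", (request.operation,)),
    )


good = run_uppercase(ProcessingRequest("transform", "abc"), ProcessingContext())
bad = run_uppercase(ProcessingRequest("transform", 42), ProcessingContext())
print(good.ok, good.value)
print(bad.ok, bad.diagnostics[0].code)
```

Returning diagnostics inside the result, rather than raising, is one way to give every extension the same failure shape. Whether the real model does that is a design decision for this task, not something the patch states.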
+
+## P13.4 - Implement extension descriptors and registries
+
+```task
+id: MKTT-WP-0013-T004
+status: todo
+priority: high
+state_hub_task_id: "3fb2fe81-9819-4679-99d0-ad60ac9e8277"
+```
+
+Define descriptor objects for extensions:
+
+- stable id
+- kind
+- version
+- implementation reference
+- capabilities
+- optional dependencies
+- safety/policy flags
+- input and output contracts
+- CLI/API affordances
+- docs/examples links
+
+Implement registries that can be assembled from in-package extension modules
+and, later, package entry points.
+
+Output: descriptor schema, registry API, duplicate/missing dependency
+diagnostics, and tests.
+
+## P13.5 - Add callback hooks and execution lifecycle
+
+```task
+id: MKTT-WP-0013-T005
+status: todo
+priority: medium
+state_hub_task_id: "be8f2056-f413-44f9-be9c-6046c34e307e"
+```
+
+Add lifecycle callbacks for:
+
+- before execution
+- after success
+- after diagnostic failure
+- provenance capture
+- cache key calculation
+- capability/policy checks
+- trace/event emission
+
+Callbacks must be explicit and deterministic by default. They should not become
+hidden global behavior.
+
+Output: callback model and tests with fake extensions.
+
+## P13.6 - Refactor query engines behind registry
+
+```task
+id: MKTT-WP-0013-T006
+status: todo
+priority: high
+state_hub_task_id: "0226c1d1-f583-43ad-8e20-f75f9790e17d"
+```
+
+Move selector and JSONPath engines behind a query-engine registry while
+preserving `query_document`, `extract_document`, `mkt query`, and `mkt extract`
+compatibility.
+
+Output: registered selector/jsonpath engines, compatibility shims, and tests.
+
+## P13.7 - Refactor processors and local backend as registered extensions
+
+```task
+id: MKTT-WP-0013-T007
+status: todo
+priority: medium
+state_hub_task_id: "a966dcbb-3ae8-47bf-85c8-4ba6ddcf7a31"
+```
+
+Adapt existing processor and backend infrastructure to expose descriptors and
+registry entries without changing their external behavior.
+
+Focus areas:
+
+- deterministic fenced processors
+- local SQLite index backend
+- backend manifests
+- FTS search
+- snapshot refresh planning
+
+Output: extension-backed processor/backend registration and regression tests.
+
+## P13.8 - Refactor CLI composition to reduce central wiring
+
+```task
+id: MKTT-WP-0013-T008
+status: todo
+priority: medium
+state_hub_task_id: "3e88ca62-8dba-4632-b5d0-29827d102322"
+```
+
+Reduce direct growth pressure in `cli/main.py` by allowing extension modules to
+register command groups or command specs through a small, testable integration
+point.
+
+Output: CLI extension hook, migrated command group examples, and unchanged
+public CLI behavior.
+
+## P13.9 - Document extension authoring conventions
+
+```task
+id: MKTT-WP-0013-T009
+status: todo
+priority: medium
+state_hub_task_id: "848e2a5e-c32b-4a94-906b-dc6aced4c71b"
+```
+
+Document how a new internal extension should be structured:
+
+- specification file
+- implementation module
+- registration descriptor
+- tests
+- docs/examples
+- diagnostics and provenance expectations
+- optional dependency handling
+- policy/capability declarations
+
+Output: extension authoring guide and one small template/example extension.
+
+## Exit Criteria
+
+- Existing behavior is covered by characterization tests before refactoring.
+- Optional features can live in well-contained modules with descriptors.
+- Central CLI/query/backend files stop being the primary integration surface for
+  every new feature.
+- The canonical processing model provides shared context/result/diagnostic/
+  provenance semantics without overfitting to pipelines.
+- The framework is clearly distinct from business-facing workflow orchestration.
+- Existing public commands and library APIs remain compatible or have explicit
+  compatibility shims.