Workplan extensible canonical processing model

2026-05-04 10:49:07 +02:00
parent 0015c8a385
commit d3a77a6eef
2 changed files with 300 additions and 0 deletions


@@ -34,6 +34,7 @@ and descriptions mirror the operational view.
| `MKTT-WP-0006` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T005` | Optional backend fabric is complete: manifests, capabilities, snapshot identity, interfaces, registry, provenance, and read-only CLI scaffolding. |
| `MKTT-WP-0010` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T006` | Content references, processors, explode/implode, weave/tangle, content classes, and migration examples are complete as the first WP-0010 extension layer. |
| `MKTT-WP-0007` | complete | done | `MKTT-WP-0006` | Advanced query and local index backend is complete: AST inspection, optional JSONPath, SQLite snapshots/metadata, FTS5 search, incremental refresh, and local index CLI. |
| `MKTT-WP-0013` | P1 | todo | `MKTT-WP-0003`, `MKTT-WP-0004`, `MKTT-WP-0006`, `MKTT-WP-0007`, `MKTT-WP-0010` | Internal extension framework and canonical processing model: characterize current behavior, add registries/descriptors/callbacks, and reduce central wiring before heavier runtime/workflow work. |
| `MKTT-WP-0005` | P2 | todo | `MKTT-WP-0003`, `MKTT-WP-0004` | Pick up when generation/form/context or semantic assessment pressure appears. |
| `MKTT-WP-0011` | P2 | todo | `MKTT-WP-0003`; task-level triggers: `MKTT-WP-0010-T001`, `MKTT-WP-0010-T005` | Declarative Markdown dataflow workflows: source extraction, deterministic/assisted processing, and multi-output generation. |
| `MKTT-WP-0009` | P2 | todo | `MKTT-WP-0006` | Establish access-control gateway before security-sensitive cache/context use. |
@@ -61,6 +62,12 @@ optional assisted generation hooks, and multiple Markdown outputs. It should not
block P3.7, but it should follow the first reference model and processor
registry decisions in `MKTT-WP-0010`.
`MKTT-WP-0013` captures internal extensibility pressure found while adding
optional query, backend, processor, and index features. It should precede major
runtime/workflow expansion because it reduces central wiring and gives future
features a canonical processing context/result/diagnostic/provenance model. It
is not a business dataflow layer; that remains `MKTT-WP-0011`.
`MKTT-WP-0012` captures the Quarkdown-inspired document function layer. It
should follow `MKTT-WP-0011` because the workflow layer will reveal which
operations deserve author-facing function syntax. It should remain optional and
@@ -85,6 +92,11 @@ dependencies:
- `MKTT-WP-0005 -> MKTT-WP-0004`
- `MKTT-WP-0011 -> MKTT-WP-0003`
- `MKTT-WP-0009 -> MKTT-WP-0006`
- `MKTT-WP-0013 -> MKTT-WP-0003`
- `MKTT-WP-0013 -> MKTT-WP-0004`
- `MKTT-WP-0013 -> MKTT-WP-0006`
- `MKTT-WP-0013 -> MKTT-WP-0007`
- `MKTT-WP-0013 -> MKTT-WP-0010`
- `MKTT-WP-0012 -> MKTT-WP-0004`
- `MKTT-WP-0012 -> MKTT-WP-0010`
- `MKTT-WP-0012 -> MKTT-WP-0011`


@@ -0,0 +1,288 @@
---
id: MKTT-WP-0013
type: workplan
title: "Internal Extension Framework and Canonical Processing Model"
domain: markitect
status: todo
owner: markitect-tool
topic_slug: markitect
planning_priority: P1
planning_order: 65
depends_on_workplans:
- MKTT-WP-0003
- MKTT-WP-0004
- MKTT-WP-0006
- MKTT-WP-0007
- MKTT-WP-0010
related_workplans:
- MKTT-WP-0005
- MKTT-WP-0009
- MKTT-WP-0011
- MKTT-WP-0012
created: "2026-05-04"
updated: "2026-05-04"
state_hub_workstream_id: "5eea103f-f584-4360-b7e3-c5b09a4814bd"
---
# MKTT-WP-0013: Internal Extension Framework and Canonical Processing Model
## Purpose
Create an internal extension framework that lets optional Markitect features
register well-contained implementations, descriptors, callbacks, diagnostics,
capabilities, and CLI/query integration points without repeatedly expanding
central modules.
This workplan is about internal extensibility and framework shape. It is
distinct from `MKTT-WP-0011`, which organizes business-facing dataflow pipelines.
## Background
Recent implementation work added valuable optional functionality:
- processor registry and deterministic fenced-block processors
- backend manifests and local SQLite backend
- selector and optional JSONPath query engines
- FTS search over indexed sections and blocks
- content references, literate workflows, explode/implode, and content classes
The functionality is working, but extension pressure is visible. Optional
features still tend to require edits in central files such as CLI wiring, query
exports, backend exports, and shared command dispatch. That is acceptable early
in a small toolkit, but it becomes a maintenance liability if Markitect is meant
to grow into a research lab for sophisticated Markdown/knowledge systems.
The target architecture should preserve the current slim core while making
extensions feel first-class:
```text
specification file + implementation module + registration descriptor
-> extension registry
-> canonical processing request/context/result
-> callbacks, diagnostics, provenance, capabilities
-> CLI/API/query/backend integration
```
## Decision
Restructure, but do it deliberately and in this order:
1. Add characterization tests for the current behaviors before refactoring.
2. Define a canonical processing model that extensions can share.
3. Introduce extension descriptors and registries with minimal central wiring.
4. Migrate one vertical slice at a time.
5. Keep compatibility aliases and existing CLI commands stable.
Avoid a plugin system that is more elaborate than the project needs. The first
version should support internal extension isolation and later package-level
discovery without forcing dynamic loading or external dependency installation.
## P13.1 - Architecture note and extension taxonomy
```task
id: MKTT-WP-0013-T001
status: todo
priority: high
state_hub_task_id: "ba106001-c953-435a-8012-0dd83533d309"
```
Define the internal extension taxonomy:
- query engines
- processors
- backends and index stores
- references and content-unit providers
- validators and contract checks
- templates/generation adapters
- CLI command groups
- future render/export adapters
- future document functions
Output: architecture note explaining extension boundaries, lifecycle,
registration semantics, and relationship to `MKTT-WP-0011`.
## P13.2 - Add characterization tests before refactor
```task
id: MKTT-WP-0013-T002
status: todo
priority: high
state_hub_task_id: "a270cb7a-4dbf-4562-b0ab-d5dda5124086"
```
Lock down current behavior before moving code behind registries:
- selector query and extraction
- optional JSONPath diagnostics
- processor registry behavior
- backend manifest registry
- local SQLite snapshot/index/search behavior
- content reference resolution
- key CLI commands and output envelopes
- provenance and diagnostics shapes
Output: focused characterization tests that fail loudly if refactoring
changes public behavior.
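
A minimal sketch of what one such characterization test could look like. The
function `legacy_query` and its envelope shape are hypothetical stand-ins, not
the real `query_document` API; the point is only the pattern of freezing a
deterministic serialization of today's output and comparing against it.

```python
import json


def legacy_query(markdown: str, selector: str) -> dict:
    """Hypothetical stand-in for an existing public query function."""
    headings = [
        line.lstrip("#").strip()
        for line in markdown.splitlines()
        if line.startswith("#")
    ]
    matches = [h for h in headings if selector.lower() in h.lower()]
    return {"ok": True, "matches": matches, "diagnostics": []}


def snapshot(value: dict) -> str:
    """Serialize with sorted keys so envelopes compare byte for byte."""
    return json.dumps(value, sort_keys=True, indent=2)


# Recorded once from the pre-refactor implementation, then frozen.
GOLDEN = snapshot({"ok": True, "matches": ["Purpose"], "diagnostics": []})


def test_query_envelope_is_unchanged() -> None:
    doc = "# Title\n## Purpose\nText\n## Background\n"
    assert snapshot(legacy_query(doc, "purpose")) == GOLDEN


test_query_envelope_is_unchanged()
```

Comparing the whole serialized envelope, rather than individual fields, is what
makes the test fail when a refactor adds, renames, or reorders output keys.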
## P13.3 - Define canonical processing model
```task
id: MKTT-WP-0013-T003
status: todo
priority: high
state_hub_task_id: "8c88b9a7-1e8d-401c-ad09-8b5a19ccba14"
```
Create shared framework types for extension execution:
- `ProcessingRequest`
- `ProcessingContext`
- `ProcessingResult`
- `ProcessingDiagnostic`
- `ProcessingCapability`
- `ProcessingProvenance`
- optional `ProcessingTrace`
The model should support deterministic, assisted, external, and read-only
operations without making every extension depend on every subsystem.
Output: framework module, tests, and migration guide for current subsystems.
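
One possible shape for these types, sketched with frozen dataclasses. Every
field name, the `OperationMode` enum, and the `run_uppercase` example are
assumptions made for illustration; `ProcessingCapability` and `ProcessingTrace`
are left out to keep the sketch short.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Mapping


class OperationMode(Enum):
    DETERMINISTIC = "deterministic"
    ASSISTED = "assisted"
    EXTERNAL = "external"
    READ_ONLY = "read_only"


@dataclass(frozen=True)
class ProcessingDiagnostic:
    code: str
    message: str
    severity: str = "error"  # assumed levels: "info", "warning", "error"


@dataclass(frozen=True)
class ProcessingProvenance:
    extension_id: str
    extension_version: str
    input_digest: str


@dataclass(frozen=True)
class ProcessingContext:
    mode: OperationMode = OperationMode.DETERMINISTIC
    # Extensions look up only the services they need, by name.
    services: Mapping[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ProcessingRequest:
    extension_id: str
    payload: Any
    options: Mapping[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ProcessingResult:
    value: Any = None
    diagnostics: tuple[ProcessingDiagnostic, ...] = ()
    provenance: ProcessingProvenance | None = None

    @property
    def ok(self) -> bool:
        # A result is ok when it carries no error-level diagnostic.
        return not any(d.severity == "error" for d in self.diagnostics)


def run_uppercase(request: ProcessingRequest, context: ProcessingContext) -> ProcessingResult:
    """Hypothetical deterministic extension using the shared types."""
    if not isinstance(request.payload, str):
        diag = ProcessingDiagnostic("invalid-payload", "expected text")
        return ProcessingResult(diagnostics=(diag,))
    digest = hashlib.sha256(request.payload.encode("utf-8")).hexdigest()
    provenance = ProcessingProvenance(request.extension_id, "0.1", digest)
    return ProcessingResult(value=request.payload.upper(), provenance=provenance)
```

Passing subsystems through a name-keyed `services` mapping is one way to keep
an extension from importing every subsystem; a typed protocol per service would
be a stricter alternative.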
## P13.4 - Implement extension descriptors and registries
```task
id: MKTT-WP-0013-T004
status: todo
priority: high
state_hub_task_id: "3fb2fe81-9819-4679-99d0-ad60ac9e8277"
```
Define descriptor objects for extensions:
- stable id
- kind
- version
- implementation reference
- capabilities
- optional dependencies
- safety/policy flags
- input and output contracts
- CLI/API affordances
- docs/examples links
Implement registries that can be assembled from in-package extension modules
and, later, package entry points.
Output: descriptor schema, registry API, diagnostics for duplicate ids and
missing dependencies, and tests.
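
A hedged sketch of a descriptor and registry covering a subset of the fields
above. The class names, the diagnostic strings, and the reading of "optional
dependencies" as importable Python module names are all assumptions; the
workplan does not fix any of them.

```python
from __future__ import annotations

import importlib.util
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ExtensionDescriptor:
    id: str
    kind: str
    version: str
    implementation: Callable[..., object]
    capabilities: frozenset[str] = frozenset()
    # Assumed to be top-level module names that may or may not be installed.
    optional_dependencies: tuple[str, ...] = ()


class ExtensionRegistry:
    def __init__(self) -> None:
        self._by_id: dict[str, ExtensionDescriptor] = {}
        self.diagnostics: list[str] = []

    def register(self, descriptor: ExtensionDescriptor) -> bool:
        """Add a descriptor; record a diagnostic instead of overwriting."""
        if descriptor.id in self._by_id:
            self.diagnostics.append(f"duplicate-extension: {descriptor.id}")
            return False
        self._by_id[descriptor.id] = descriptor
        return True

    def by_kind(self, kind: str) -> list[ExtensionDescriptor]:
        return [d for d in self._by_id.values() if d.kind == kind]

    def missing_dependencies(self) -> list[str]:
        """Report optional modules that cannot be found, without importing them."""
        problems = []
        for descriptor in self._by_id.values():
            for module_name in descriptor.optional_dependencies:
                if importlib.util.find_spec(module_name) is None:
                    problems.append(
                        f"missing-dependency: {descriptor.id} needs {module_name}"
                    )
        return problems
```

Returning diagnostics rather than raising keeps registry assembly read-only and
lets a caller decide whether a missing optional dependency is an error.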
## P13.5 - Add callback hooks and execution lifecycle
```task
id: MKTT-WP-0013-T005
status: todo
priority: medium
state_hub_task_id: "be8f2056-f413-44f9-be9c-6046c34e307e"
```
Add lifecycle callbacks for:
- before execution
- after success
- after diagnostic failure
- provenance capture
- cache key calculation
- capability/policy checks
- trace/event emission
Callbacks must be explicit and deterministic by default. They should not become
hidden global behavior.
Output: callback model and tests with fake extensions.
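
A small sketch of explicit lifecycle hooks exercised with a fake extension. It
covers only three of the callbacks listed above, and `LifecycleHooks`,
`execute`, and the `(value, diagnostics)` return convention are assumptions.
Hooks are passed in as an argument, so nothing runs unless the caller supplies
it.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

Hook = Callable[[str, Any], None]
Runner = Callable[[Any], Tuple[Any, List[str]]]


@dataclass
class LifecycleHooks:
    before_execution: List[Hook] = field(default_factory=list)
    after_success: List[Hook] = field(default_factory=list)
    after_diagnostic_failure: List[Hook] = field(default_factory=list)


def execute(extension_id: str, run: Runner, payload: Any, hooks: LifecycleHooks):
    """Run an extension and call hooks in registration order."""
    for hook in hooks.before_execution:
        hook(extension_id, payload)
    value, diagnostics = run(payload)
    if diagnostics:
        for hook in hooks.after_diagnostic_failure:
            hook(extension_id, diagnostics)
    else:
        for hook in hooks.after_success:
            hook(extension_id, value)
    return value, diagnostics


def fake_extension(payload: str):
    """Fake extension: fails with a diagnostic on empty input."""
    if not payload:
        return None, ["empty-input"]
    return payload.strip(), []


events: List[str] = []
hooks = LifecycleHooks(
    before_execution=[lambda ext, _payload: events.append(f"before:{ext}")],
    after_success=[lambda ext, _value: events.append(f"success:{ext}")],
    after_diagnostic_failure=[
        lambda ext, diags: events.append(f"failure:{ext}:{diags[0]}")
    ],
)
execute("fake.strip", fake_extension, "  text  ", hooks)
execute("fake.strip", fake_extension, "", hooks)
```

Because hooks live in lists and fire in order, the recorded `events` sequence
is reproducible, which is what a test of the callback model would assert on.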
## P13.6 - Refactor query engines behind registry
```task
id: MKTT-WP-0013-T006
status: todo
priority: high
state_hub_task_id: "0226c1d1-f583-43ad-8e20-f75f9790e17d"
```
Move selector and JSONPath engines behind a query-engine registry while
preserving `query_document`, `extract_document`, `mkt query`, and `mkt extract`
compatibility.
Output: registered selector/jsonpath engines, compatibility shims, and tests.
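
The compatibility-shim idea could be sketched as follows. The signature of
`query_document` shown here, the `engine` parameter, and the toy selector logic
are hypothetical; the real function's signature is not stated in this workplan.

```python
from typing import Callable, Dict, List

QueryEngine = Callable[[str, str], List[str]]

_ENGINES: Dict[str, QueryEngine] = {}


def register_query_engine(name: str):
    """Decorator that adds an engine to the registry under a stable name."""
    def decorator(fn: QueryEngine) -> QueryEngine:
        if name in _ENGINES:
            raise ValueError(f"query engine already registered: {name!r}")
        _ENGINES[name] = fn
        return fn
    return decorator


@register_query_engine("selector")
def _selector_engine(document: str, expression: str) -> List[str]:
    # Toy behavior: return heading texts that contain the expression.
    headings = [
        line.lstrip("#").strip()
        for line in document.splitlines()
        if line.startswith("#")
    ]
    return [h for h in headings if expression in h]


def query_document(document: str, expression: str, engine: str = "selector") -> List[str]:
    """Shim: keeps the old entry point and delegates to the registry."""
    try:
        impl = _ENGINES[engine]
    except KeyError:
        raise LookupError(f"unknown query engine: {engine!r}") from None
    return impl(document, expression)
```

Existing callers keep calling `query_document` unchanged; only the dispatch
behind it moves to the registry.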
## P13.7 - Refactor processors and local backend as registered extensions
```task
id: MKTT-WP-0013-T007
status: todo
priority: medium
state_hub_task_id: "a966dcbb-3ae8-47bf-85c8-4ba6ddcf7a31"
```
Adapt existing processor and backend infrastructure to expose descriptors and
registry entries without changing their external behavior.
Focus areas:
- deterministic fenced processors
- local SQLite index backend
- backend manifests
- FTS search
- snapshot refresh planning
Output: extension-backed processor/backend registration and regression tests.
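
A minimal sketch of the adapter approach for one processor. The processor
function, the `ProcessorEntry` fields, and the id `processor.uppercase` are
hypothetical; the point is that the existing callable is referenced rather than
rewritten, so the regression check can compare both paths directly.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def legacy_uppercase_processor(block_text: str) -> str:
    """Stand-in for an existing deterministic fenced-block processor."""
    return block_text.upper()


@dataclass(frozen=True)
class ProcessorEntry:
    id: str
    kind: str
    deterministic: bool
    run: Callable[[str], str]


PROCESSORS: Dict[str, ProcessorEntry] = {}


def expose(entry: ProcessorEntry) -> ProcessorEntry:
    """Publish an existing processor through the registry without wrapping it."""
    PROCESSORS[entry.id] = entry
    return entry


expose(
    ProcessorEntry(
        id="processor.uppercase",
        kind="processor",
        deterministic=True,
        run=legacy_uppercase_processor,
    )
)

# Regression check: the registry path and the direct call must agree.
sample = "fenced block body"
assert PROCESSORS["processor.uppercase"].run(sample) == legacy_uppercase_processor(sample)
```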
## P13.8 - Refactor CLI composition to reduce central wiring
```task
id: MKTT-WP-0013-T008
status: todo
priority: medium
state_hub_task_id: "3e88ca62-8dba-4632-b5d0-29827d102322"
```
Reduce direct growth pressure in `cli/main.py` by allowing extension modules to
register command groups or command specs through a small, testable integration
point.
Output: CLI extension hook, migrated command group examples, and unchanged
public CLI behavior.
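
One way such an integration point might look, sketched with the standard
library's `argparse`. The CLI framework actually used in `cli/main.py` is not
stated here, and the `cli_extension` decorator and `index-status` command are
invented for illustration.

```python
import argparse
from typing import Any, Callable, List, Optional, Sequence

# Each registrar receives the subparsers object and adds its own commands.
_REGISTRARS: List[Callable[[Any], None]] = []


def cli_extension(fn: Callable[[Any], None]) -> Callable[[Any], None]:
    """Mark a function as a CLI registrar; the central module never names it."""
    _REGISTRARS.append(fn)
    return fn


@cli_extension
def register_index_commands(subparsers: Any) -> None:
    # Hypothetical command contributed by an index extension module.
    parser = subparsers.add_parser("index-status", help="show index status")
    parser.add_argument("--verbose", action="store_true")
    parser.set_defaults(
        handler=lambda args: "index ok (verbose)" if args.verbose else "index ok"
    )


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mkt")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for registrar in _REGISTRARS:
        registrar(subparsers)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

With this shape, adding a command group means adding a registrar in the
extension module; `build_parser` and `main` stay unchanged.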
## P13.9 - Document extension authoring conventions
```task
id: MKTT-WP-0013-T009
status: todo
priority: medium
state_hub_task_id: "848e2a5e-c32b-4a94-906b-dc6aced4c71b"
```
Document how a new internal extension should be structured:
- specification file
- implementation module
- registration descriptor
- tests
- docs/examples
- diagnostics and provenance expectations
- optional dependency handling
- policy/capability declarations
Output: extension authoring guide and one small template/example extension.
## Exit Criteria
- Existing behavior is covered by characterization tests before refactoring.
- Optional features can live in well-contained modules with descriptors.
- Central CLI/query/backend files stop being the primary integration surface for
every new feature.
- The canonical processing model provides shared context/result/diagnostic/
provenance semantics without overfitting to pipelines.
- The framework is clearly distinct from business-facing workflow orchestration.
- Existing public commands and library APIs remain compatible or have explicit
compatibility shims.