---
id: MKTT-WP-0013
type: workplan
title: "Internal Extension Framework and Canonical Processing Model"
domain: markitect
status: done
owner: markitect-tool
topic_slug: markitect
planning_priority: P1
planning_order: 65
depends_on_workplans:
  - MKTT-WP-0003
  - MKTT-WP-0004
  - MKTT-WP-0006
  - MKTT-WP-0007
  - MKTT-WP-0010
related_workplans:
  - MKTT-WP-0005
  - MKTT-WP-0009
  - MKTT-WP-0011
  - MKTT-WP-0012
created: "2026-05-04"
updated: "2026-05-04"
state_hub_workstream_id: "5eea103f-f584-4360-b7e3-c5b09a4814bd"
---

# MKTT-WP-0013: Internal Extension Framework and Canonical Processing Model

## Purpose

Create an internal extension framework that lets optional Markitect features
register well-contained implementations, descriptors, callbacks, diagnostics,
capabilities, and CLI/query integration points without repeatedly expanding
central modules.

This workplan is about internal extensibility and framework shape. It is
distinct from `MKTT-WP-0011`, which organizes business-facing dataflow pipelines.

## Background

Recent implementation work added valuable optional functionality:

- processor registry and deterministic fenced-block processors
- backend manifests and local SQLite backend
- selector and optional JSONPath query engines
- FTS search over indexed sections and blocks
- content references, literate workflows, explode/implode, and content classes

The functionality is working, but extension pressure is visible. Optional
features still tend to require edits in central files such as CLI wiring, query
exports, backend exports, and shared command dispatch. That is acceptable early
in a small toolkit, but it becomes a maintenance liability if Markitect is meant
to grow into a research lab for sophisticated Markdown/knowledge systems.

The target architecture should preserve the current slim core while making
extensions feel first-class:

```text
specification file + implementation module + registration descriptor
  -> extension registry
  -> canonical processing request/context/result
  -> callbacks, diagnostics, provenance, capabilities
  -> CLI/API/query/backend integration
```

## Decision

Yes, restructure, but do it deliberately:

1. Add characterization tests for the current behaviors before refactoring.
2. Define a canonical processing model that extensions can share.
3. Introduce extension descriptors and registries with minimal central wiring.
4. Migrate one vertical slice at a time.
5. Keep compatibility aliases and existing CLI commands stable.

Avoid a plugin system that is more elaborate than the project needs. The first
version should support internal extension isolation and later package-level
discovery without forcing dynamic loading or external dependency installation.

## P13.1 - Architecture note and extension taxonomy

```task
id: MKTT-WP-0013-T001
status: done
priority: high
state_hub_task_id: "ba106001-c953-435a-8012-0dd83533d309"
```

Define the internal extension taxonomy:

- query engines
- processors
- backends and index stores
- references and content-unit providers
- validators and contract checks
- templates/generation adapters
- CLI command groups
- future render/export adapters
- future document functions

Output: architecture note explaining extension boundaries, lifecycle,
registration semantics, and relationship to `MKTT-WP-0011`.

Implemented: `docs/internal-extension-framework.md` defines the internal
extension boundary, extension taxonomy, canonical lifecycle, descriptor shape,
processing model, registration strategy, compatibility rules, and
characterization coverage.

## P13.2 - Add characterization tests before refactor

```task
id: MKTT-WP-0013-T002
status: done
priority: high
state_hub_task_id: "a270cb7a-4dbf-4562-b0ab-d5dda5124086"
```

Lock down current behavior before moving code behind registries:

- selector query and extraction
- optional JSONPath diagnostics
- processor registry behavior
- backend manifest registry
- local SQLite snapshot/index/search behavior
- content reference resolution
- key CLI commands and output envelopes
- provenance and diagnostics shapes

Output: focused characterization tests that can fail loudly if refactoring
changes public behavior.

Implemented: `tests/test_extension_characterization.py` covers selector
query/extraction, JSONPath optional-dependency diagnostics, processor
provenance and diagnostics, backend manifest/capability behavior, local
snapshot/index/search behavior, content references, and representative CLI
output envelopes.
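
A characterization test of the kind described above might look roughly like the following sketch. It is not taken from `tests/test_extension_characterization.py`: `run_query`, `envelope_shape`, and the envelope keys are hypothetical stand-ins, chosen only to show how a test can pin both the shape and the exact value of an output envelope before a refactor.

```python
import json


def run_query(document: str, selector: str) -> dict:
    """Hypothetical stand-in for the code under test; not Markitect's real API."""
    headings = [
        line.lstrip("# ").strip()
        for line in document.splitlines()
        if line.startswith("#")
    ]
    matches = [h for h in headings if selector.lower() in h.lower()]
    return {"ok": True, "engine": "selector", "matches": matches, "diagnostics": []}


def envelope_shape(value):
    """Reduce a value to keys and type names so a test can pin structure, not data."""
    if isinstance(value, dict):
        return {key: envelope_shape(item) for key, item in sorted(value.items())}
    if isinstance(value, list):
        return [envelope_shape(value[0])] if value else []
    return type(value).__name__


# Assumed envelope shape, recorded once from the pre-refactor behavior.
EXPECTED_SHAPE = {"diagnostics": [], "engine": "str", "matches": ["str"], "ok": "bool"}


def test_query_envelope_is_stable():
    result = run_query("# Purpose\n\n## Background\n", "purpose")
    # Structural check: fails if a key is renamed, removed, or changes type.
    assert envelope_shape(result) == EXPECTED_SHAPE
    # Exact check: fails if the observable output changes at all.
    expected = {"ok": True, "engine": "selector", "matches": ["Purpose"], "diagnostics": []}
    assert json.dumps(result, sort_keys=True) == json.dumps(expected, sort_keys=True)
```

Pinning the shape separately from the exact value keeps failures readable: a shape mismatch points at a contract change, while a value mismatch points at a behavior change.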

## P13.3 - Define canonical processing model

```task
id: MKTT-WP-0013-T003
status: done
priority: high
state_hub_task_id: "8c88b9a7-1e8d-401c-ad09-8b5a19ccba14"
```

Create shared framework types for extension execution:

- `ProcessingRequest`
- `ProcessingContext`
- `ProcessingResult`
- `ProcessingDiagnostic`
- `ProcessingCapability`
- `ProcessingProvenance`
- optional `ProcessingTrace`

The model should support deterministic, assisted, external, and read-only
operations without making every extension depend on every subsystem.

Output: framework module, tests, and migration guide for current subsystems.

Implemented: `markitect_tool.extension.processing` defines
`ProcessingRequest`, `ProcessingContext`, `ProcessingResult`,
`ProcessingDiagnostic`, `ProcessingCapability`, `ProcessingProvenance`, and
`ProcessingTrace`, with serialization, cache-key, validity, provenance, trace,
and error normalization tests.
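
To make the request/result envelope idea concrete, here is a minimal sketch of three of the types. Only the class names come from this workplan; every field, the `cache_key` and `to_dict` methods, and the severity levels are assumptions for illustration and may differ from what `markitect_tool.extension.processing` actually defines.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class ProcessingDiagnostic:
    code: str
    message: str
    severity: str = "error"  # assumed levels: "info", "warning", "error"


@dataclass(frozen=True)
class ProcessingRequest:
    extension_id: str
    operation: str
    payload: Dict[str, Any] = field(default_factory=dict)

    def cache_key(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) makes the key
        # independent of the order in which payload entries were added.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ProcessingResult:
    request: ProcessingRequest
    output: Any = None
    diagnostics: Tuple[ProcessingDiagnostic, ...] = ()

    @property
    def ok(self) -> bool:
        # A result is valid when it carries no error-level diagnostic.
        return not any(d.severity == "error" for d in self.diagnostics)

    def to_dict(self) -> Dict[str, Any]:
        return {"ok": self.ok, **asdict(self)}


# Hypothetical extension id and payload, for illustration only.
request = ProcessingRequest("example.selector", "query", {"selector": "h2", "limit": 5})
failed = ProcessingResult(
    request,
    diagnostics=(ProcessingDiagnostic("example.bad-selector", "selector did not parse"),),
)
```

Keeping the types frozen and free of subsystem imports is one way to meet the goal that extensions share an envelope without depending on every subsystem.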

## P13.4 - Implement extension descriptors and registries

```task
id: MKTT-WP-0013-T004
status: done
priority: high
state_hub_task_id: "3fb2fe81-9819-4679-99d0-ad60ac9e8277"
```

Define descriptor objects for extensions:

- stable id
- kind
- version
- implementation reference
- capabilities
- optional dependencies
- safety/policy flags
- input and output contracts
- CLI/API affordances
- docs/examples links

Implement registries that can be assembled from in-package extension modules
and, later, package entry points.

Output: descriptor schema, registry API, duplicate/missing dependency
diagnostics, and tests.

Implemented: `markitect_tool.extension.registry` defines
`ExtensionDescriptor`, `OptionalDependency`, `ExtensionRegistry`,
`ExtensionDependencyCheck`, and `ExtensionRegistryError`, with descriptor
serialization, kind/capability lookup, duplicate-id diagnostics, dependency
checks, and factory instantiation tests.
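
The descriptor and registry responsibilities listed above can be sketched as follows. The class names match this workplan, but the fields, method names, extension ids, and the missing module name are all hypothetical; the real `markitect_tool.extension.registry` API may look different. Dependency availability is probed with the standard library's `importlib.util.find_spec`, so nothing is imported just to check for it.

```python
import importlib.util
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


class ExtensionRegistryError(Exception):
    """Raised on registration conflicts such as duplicate ids."""


@dataclass(frozen=True)
class OptionalDependency:
    module: str  # import name to probe
    purpose: str = ""

    def is_available(self) -> bool:
        try:
            return importlib.util.find_spec(self.module) is not None
        except (ImportError, ValueError):
            # find_spec raises if a parent package of a dotted name is missing.
            return False


@dataclass(frozen=True)
class ExtensionDescriptor:
    id: str
    kind: str
    version: str
    factory: Callable[[], object]
    capabilities: Tuple[str, ...] = ()
    optional_dependencies: Tuple[OptionalDependency, ...] = ()


class ExtensionRegistry:
    def __init__(self) -> None:
        self._by_id: Dict[str, ExtensionDescriptor] = {}

    def register(self, descriptor: ExtensionDescriptor) -> None:
        if descriptor.id in self._by_id:
            raise ExtensionRegistryError(f"duplicate extension id: {descriptor.id}")
        self._by_id[descriptor.id] = descriptor

    def by_kind(self, kind: str) -> List[ExtensionDescriptor]:
        return [d for d in self._by_id.values() if d.kind == kind]

    def missing_dependencies(self, extension_id: str) -> List[str]:
        descriptor = self._by_id[extension_id]
        return [
            dep.module
            for dep in descriptor.optional_dependencies
            if not dep.is_available()
        ]


# Hypothetical descriptors; ids and the missing module name are made up.
selector = ExtensionDescriptor("example.selector", "query-engine", "1.0", lambda: "selector")
optional = ExtensionDescriptor(
    "example.optional",
    "query-engine",
    "1.0",
    lambda: "optional",
    optional_dependencies=(
        OptionalDependency("json"),
        OptionalDependency("markitect_hypothetical_missing_dep"),
    ),
)
registry = ExtensionRegistry()
registry.register(selector)
registry.register(optional)
```

Reporting missing dependencies as data rather than raising at registration time lets an optional extension stay listed while its commands emit a diagnostic instead of crashing.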

## P13.5 - Add callback hooks and execution lifecycle

```task
id: MKTT-WP-0013-T005
status: done
priority: medium
state_hub_task_id: "be8f2056-f413-44f9-be9c-6046c34e307e"
```

Add lifecycle callbacks for:

- before execution
- after success
- after diagnostic failure
- provenance capture
- cache key calculation
- capability/policy checks
- trace/event emission

Callbacks must be explicit and deterministic by default. They should not become
hidden global behavior.

Output: callback model and tests with fake extensions.

Implemented: `ExtensionLifecycle` and `ExtensionExecutor` provide explicit
before/success/failure/after callbacks, dependency checks before execution,
result type normalization, execution trace emission, and fake-extension tests.
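
One way to keep callbacks explicit rather than global is to pass them in as a value, as in this sketch. `ExtensionLifecycle` and `ExtensionExecutor` are named in this workplan, but the callback signature, the dict-shaped result, and the `execute` method here are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

# Assumed callback signature: (extension_id, request_or_result) -> None.
Callback = Callable[[str, Any], None]


@dataclass
class ExtensionLifecycle:
    before: List[Callback] = field(default_factory=list)
    on_success: List[Callback] = field(default_factory=list)
    on_failure: List[Callback] = field(default_factory=list)
    after: List[Callback] = field(default_factory=list)


class ExtensionExecutor:
    def __init__(self, lifecycle: ExtensionLifecycle) -> None:
        # The lifecycle is injected, so there is no hidden module-level hook list.
        self.lifecycle = lifecycle

    def _emit(self, callbacks: List[Callback], extension_id: str, value: Any) -> None:
        for callback in callbacks:  # list order gives deterministic call order
            callback(extension_id, value)

    def execute(self, extension_id: str, handler: Callable[[Any], Any], request: Any) -> dict:
        self._emit(self.lifecycle.before, extension_id, request)
        try:
            output = handler(request)
        except Exception as exc:
            # Normalize the failure into a diagnostic instead of propagating it.
            result = {
                "ok": False,
                "output": None,
                "diagnostics": [{"code": type(exc).__name__, "message": str(exc)}],
            }
            self._emit(self.lifecycle.on_failure, extension_id, result)
        else:
            result = {"ok": True, "output": output, "diagnostics": []}
            self._emit(self.lifecycle.on_success, extension_id, result)
        self._emit(self.lifecycle.after, extension_id, result)
        return result


# Fake extension run: record which callbacks fire and in what order.
events: List[str] = []
lifecycle = ExtensionLifecycle(
    before=[lambda ext, req: events.append("before")],
    on_success=[lambda ext, res: events.append("success")],
    on_failure=[lambda ext, res: events.append("failure")],
    after=[lambda ext, res: events.append("after")],
)
executor = ExtensionExecutor(lifecycle)
success = executor.execute("example.upper", str.upper, "markitect")
```

Because the success callbacks run in the `else` branch, an exception raised by a callback is not mistaken for an extension failure.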

## P13.6 - Refactor query engines behind registry

```task
id: MKTT-WP-0013-T006
status: done
priority: high
state_hub_task_id: "0226c1d1-f583-43ad-8e20-f75f9790e17d"
```

Move selector and JSONPath engines behind a query-engine registry while
preserving `query_document`, `extract_document`, `mkt query`, and `mkt extract`
compatibility.

Output: registered selector/jsonpath engines, compatibility shims, and tests.

Implemented: selector and JSONPath engines now live behind
`QueryEngineRegistry` descriptors, with compatibility shims for
`query_document`, `extract_document`, `query_document_jsonpath`, and
`extract_document_jsonpath`; CLI behavior remains unchanged.
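
The compatibility-shim pattern can be illustrated with a toy registry. `QueryEngineRegistry` and `query_document` are names from this workplan, but the signatures below, the `engine` parameter, and the toy selector engine are assumptions for the sketch and do not describe the real implementations.

```python
from typing import Callable, Dict, List

# Assumed engine signature: (document_text, expression) -> matching lines.
QueryEngine = Callable[[str, str], List[str]]


class QueryEngineRegistry:
    def __init__(self) -> None:
        self._engines: Dict[str, QueryEngine] = {}

    def register(self, name: str, engine: QueryEngine) -> None:
        self._engines[name] = engine

    def get(self, name: str) -> QueryEngine:
        try:
            return self._engines[name]
        except KeyError:
            raise LookupError(f"unknown query engine: {name}") from None


def _toy_selector_engine(document: str, expression: str) -> List[str]:
    """Toy engine: return heading lines whose text contains the expression."""
    return [
        line
        for line in document.splitlines()
        if line.startswith("#") and expression.lower() in line.lower()
    ]


DEFAULT_REGISTRY = QueryEngineRegistry()
DEFAULT_REGISTRY.register("selector", _toy_selector_engine)


def query_document(document: str, expression: str, engine: str = "selector") -> List[str]:
    """Compatibility shim: callers keep the old call shape; dispatch goes via the registry."""
    return DEFAULT_REGISTRY.get(engine)(document, expression)


headings = query_document("# Purpose\ntext\n## Background\n", "background")
```

Existing callers continue to call `query_document(document, expression)` unchanged, while new engines only need a `register` call instead of an edit to the shim.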

## P13.7 - Refactor processors and local backend as registered extensions

```task
id: MKTT-WP-0013-T007
status: done
priority: medium
state_hub_task_id: "a966dcbb-3ae8-47bf-85c8-4ba6ddcf7a31"
```

Adapt existing processor and backend infrastructure to expose descriptors and
registry entries without changing their external behavior.

Focus areas:

- deterministic fenced processors
- local SQLite index backend
- backend manifests
- FTS search
- snapshot refresh planning

Output: extension-backed processor/backend registration and regression tests.

Implemented: `builtin_extension_registry()` now exposes built-in query engines,
deterministic processors, and the local SQLite backend as extension
descriptors with capabilities, safety flags, CLI affordances, docs/examples,
diagnostic namespaces, and provenance prefixes.

## P13.8 - Refactor CLI composition to reduce central wiring

```task
id: MKTT-WP-0013-T008
status: done
priority: medium
state_hub_task_id: "3e88ca62-8dba-4632-b5d0-29827d102322"
```

Reduce direct growth pressure in `cli/main.py` by allowing extension modules to
register command groups or command specs through a small, testable integration
point.

Output: CLI extension hook, migrated command group examples, and unchanged
public CLI behavior.

Implemented first integration point: `markitect_tool.cli.extensions` derives
`CliCommandSpec` declarations from extension descriptors. Built-in query,
processor, and backend descriptors now expose command affordances such as
`mkt query`, `mkt process`, `mkt cache index`, and `mkt search` without making
the CLI module the only source of command metadata.
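
Deriving command metadata from descriptors could look something like this sketch. `CliCommandSpec` is named in this workplan; its fields, the dict-shaped descriptors, the extension ids, and the `command_specs` and `resolve` helpers are hypothetical. The command paths mirror the `mkt` subcommands mentioned above, without the `mkt` prefix.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class CliCommandSpec:
    path: Tuple[str, ...]  # e.g. ("cache", "index") for `mkt cache index`
    extension_id: str
    help: str = ""


def command_specs(descriptors: Iterable[dict]) -> List[CliCommandSpec]:
    """Collect one spec per CLI affordance declared by each descriptor."""
    specs = [
        CliCommandSpec(tuple(command.split()), descriptor["id"], descriptor.get("summary", ""))
        for descriptor in descriptors
        for command in descriptor.get("cli", ())
    ]
    return sorted(specs, key=lambda spec: spec.path)


def resolve(argv: Sequence[str], specs: Iterable[CliCommandSpec]) -> Optional[CliCommandSpec]:
    """Return the spec whose command path is the longest prefix of argv, if any."""
    best = None
    for spec in specs:
        if tuple(argv[: len(spec.path)]) == spec.path:
            if best is None or len(spec.path) > len(best.path):
                best = spec
    return best


# Hypothetical descriptors; ids and summaries are made up for illustration.
DESCRIPTORS = [
    {"id": "example.query", "cli": ["query"], "summary": "Query a document"},
    {"id": "example.processor", "cli": ["process"]},
    {"id": "example.backend", "cli": ["cache index", "search"]},
]
specs = command_specs(DESCRIPTORS)
owner = resolve(["cache", "index", "docs/"], specs)
```

With this shape, the central CLI module only needs to iterate over `specs`; adding a command becomes a descriptor change rather than an edit to `cli/main.py`.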

## P13.9 - Document extension authoring conventions

```task
id: MKTT-WP-0013-T009
status: done
priority: medium
state_hub_task_id: "848e2a5e-c32b-4a94-906b-dc6aced4c71b"
```

Document how a new internal extension should be structured:

- specification file
- implementation module
- registration descriptor
- tests
- docs/examples
- diagnostics and provenance expectations
- optional dependency handling
- policy/capability declarations

Output: extension authoring guide and one small template/example extension.

Implemented: `docs/extension-authoring.md` documents extension layout,
descriptor template, optional dependency declarations, processing envelopes,
diagnostics, provenance, safety/policy metadata, CLI affordances, tests, and
the boundary with business-facing workflows.

## Exit Criteria

- Existing behavior is covered by characterization tests before refactoring.
- Optional features can live in well-contained modules with descriptors.
- Central CLI/query/backend files stop being the primary integration surface for
  every new feature.
- The canonical processing model provides shared
  context/result/diagnostic/provenance semantics without overfitting to
  pipelines.
- The framework is clearly distinct from business-facing workflow orchestration.
- Existing public commands and library APIs remain compatible or have explicit
  compatibility shims.