coulomb/markitect-tool

generated from coulomb/repo-seed

tegwick 33fa602fe5 Extension framework optimization

2026-05-04 11:49:39 +02:00

Internal Extension Authoring

Purpose

This guide describes how to add a new internal Markitect extension without turning central modules into the main integration surface.

Use this for internal query engines, processors, backend/index stores, reference providers, validators, template/generation adapters, CLI command groups, render/export adapters, and future document functions.

Recommended Shape

Each extension should have:

implementation module
descriptor or descriptor factory
focused tests
characterization coverage if it changes existing behavior
documentation or example link
diagnostic namespace
provenance operation prefix
optional dependency declaration
capability and safety declarations

Prefer this file layout:

src/markitect_tool/<area>/<feature>.py
tests/test_<area>_<feature>.py
docs/<feature>.md

If the extension is cross-cutting, register it from markitect_tool.extension.builtins or a future internal discovery module rather than importing it from many central files.

Descriptor Template

from markitect_tool.extension import ExtensionDescriptor, ProcessingCapability


def my_extension_descriptor() -> ExtensionDescriptor:
    return ExtensionDescriptor(
        id="query.example",
        kind="query-engine",
        summary="Example query engine.",
        capabilities=[
            ProcessingCapability(id="ast", kind="read"),
        ],
        input_contract="Document + example expression",
        output_contract="QueryMatch[]",
        diagnostics_namespace="query.example",
        provenance_prefix="query.example",
        cli={"commands": ["mkt query --engine example"]},
        docs=["docs/example-query.md"],
        examples=["examples/query/example.md"],
    )

Optional Dependencies

Declare optional dependencies in descriptors:

from markitect_tool.extension import OptionalDependency

OptionalDependency(
    name="jsonpath_ng",
    package="jsonpath-ng",
    extra="query",
    required=True,
    purpose="Evaluate JSONPath expressions.",
)

If a dependency is missing, return a structured diagnostic. Do not fail with an unexplained import error.
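
One way to turn a missing import into a structured diagnostic is to probe for the module before using it. The sketch below is illustrative only: DependencyDiagnostic and check_optional_dependency are hypothetical stand-ins, not part of markitect_tool, and the real Diagnostic type may carry different fields.

```python
from dataclasses import dataclass
from importlib.util import find_spec
from typing import Optional


@dataclass(frozen=True)
class DependencyDiagnostic:
    """Hypothetical stand-in for Markitect's real Diagnostic type."""

    code: str
    message: str


def check_optional_dependency(
    name: str, package: str, extra: str
) -> Optional[DependencyDiagnostic]:
    """Return None if module `name` is importable, else a structured diagnostic."""
    try:
        found = find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for dotted names with a missing parent package.
        found = False
    if found:
        return None
    return DependencyDiagnostic(
        code="extension.missing_dependency",
        message=(
            f"Module '{name}' is not installed; "
            f"install package '{package}' (extra '{extra}')."
        ),
    )


# A module name that should not exist, and one from the standard library.
missing = check_optional_dependency("surely_not_installed_pkg_xyz", "not-a-real-package", "query")
present = check_optional_dependency("json", "json", "stdlib")
```

Here missing is a diagnostic the caller can report or test against, while present is None, so the extension can proceed to import the module.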

Processing Envelopes

Use canonical processing envelopes where an extension needs a shared execution boundary:

ProcessingRequest
ProcessingContext
ProcessingResult
ProcessingCapability
ProcessingProvenance
ProcessingTrace

Subsystem-specific dataclasses may remain richer. The canonical model is the bridge that lets callbacks, registries, diagnostics, provenance, and future policy checks interact consistently.

Minimal Runnable Extension

from markitect_tool.extension import (
    ExtensionDescriptor,
    ExtensionExecutor,
    ExtensionRegistry,
    ProcessingRequest,
    ProcessingResult,
)


def run_example(request: ProcessingRequest) -> ProcessingResult:
    # Read the optional "name" field from the request input.
    name = request.input.get("name", "world")
    return ProcessingResult(output=f"Hello, {name}")


# The factory returns the callable that handles requests for this extension.
descriptor = ExtensionDescriptor(
    id="example.hello",
    kind="example",
    summary="Small example extension.",
    factory=lambda: run_example,
)

# Register the descriptor, then execute it by id through the executor boundary.
registry = ExtensionRegistry([descriptor])
result = ExtensionExecutor(registry).execute(
    "example.hello",
    ProcessingRequest(operation="example.hello", input={"name": "Markitect"}),
)

Use this executor boundary when callbacks, dependency checks, trace events, or future policy checks matter. For tiny deterministic helpers, it is still fine to keep the existing direct function API and expose a descriptor alongside it.

Cache-Key Rules

ProcessingRequest.cache_key includes:

operation
input
stable context material
options
scope
declared capabilities
request metadata

Stable context material includes source path, namespaces, variables, policy, and metadata. It does not include workspace root, caller, or live backend handles. This keeps cache keys portable while avoiding collisions for context-sensitive operations.
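
The effect of these rules can be sketched with a small hashing helper. This is not the actual ProcessingRequest.cache_key implementation: sketch_cache_key, the context field names, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Iterable, Mapping, Optional

# Context fields the guide lists as stable (field names are assumed).
# Anything else, such as workspace root or caller, is dropped from the key.
STABLE_CONTEXT_FIELDS = ("source_path", "namespaces", "variables", "policy", "metadata")


def sketch_cache_key(
    operation: str,
    input: Any,
    context: Mapping[str, Any],
    options: Optional[Mapping[str, Any]] = None,
    scope: Optional[str] = None,
    capabilities: Iterable[str] = (),
    metadata: Optional[Mapping[str, Any]] = None,
) -> str:
    """Hash the key material into a hex digest, ignoring volatile context."""
    stable_context = {k: context[k] for k in STABLE_CONTEXT_FIELDS if k in context}
    material = {
        "operation": operation,
        "input": input,
        "context": stable_context,
        "options": dict(options or {}),
        "scope": scope,
        "capabilities": sorted(capabilities),
        "metadata": dict(metadata or {}),
    }
    # Sorted keys and fixed separators make the serialization deterministic.
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


base = {"source_path": "docs/a.md", "variables": {"lang": "en"}}
key_a = sketch_cache_key(
    "query.example", {"expr": "h1"},
    {**base, "workspace_root": "/home/alice/repo", "caller": "cli"},
)
key_b = sketch_cache_key(
    "query.example", {"expr": "h1"},
    {**base, "workspace_root": "/srv/ci/repo", "caller": "test"},
)
key_c = sketch_cache_key(
    "query.example", {"expr": "h1"},
    {**base, "variables": {"lang": "de"}},
)
```

key_a and key_b are equal because they differ only in workspace root and caller, which is what makes the key portable. key_c differs because the variables are part of the stable context.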

Diagnostics

Diagnostics should be:

stable enough for tests and callers
namespaced by subsystem or extension
explicit about optional dependency failures
tied to source locations where possible
emitted as a Diagnostic or via ProcessingResult.from_error

Recommended format for diagnostic codes, followed by examples:

<extension-kind>.<condition>
query.invalid_jsonpath
processor.unknown
extension.missing_dependency
backend.local_sqlite.invalid_fts_query
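
Because diagnostic codes should be stable enough for tests, a test suite could check their format mechanically. The pattern below is an assumption inferred from the examples above (two or more lowercase dotted segments, with digits and underscores allowed), not a rule this guide states; is_valid_diagnostic_code is a hypothetical helper.

```python
import re

# Assumed convention: at least two lowercase segments joined by single dots.
DIAGNOSTIC_CODE = re.compile(r"[a-z][a-z0-9_]*(?:\.[a-z][a-z0-9_]*)+")


def is_valid_diagnostic_code(code: str) -> bool:
    """Return True if `code` matches the assumed dotted format."""
    return DIAGNOSTIC_CODE.fullmatch(code) is not None


examples = [
    "query.invalid_jsonpath",
    "processor.unknown",
    "extension.missing_dependency",
    "backend.local_sqlite.invalid_fts_query",
]
all_examples_valid = all(is_valid_diagnostic_code(code) for code in examples)
```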

Provenance

Every extension that transforms, queries, reads, writes, generates, or indexes content should expose provenance. Use a stable operation prefix:

query.selector
query.jsonpath
processor.include
local_snapshot_store.put_file

Include source path, content hash, snapshot id, backend/provider id, and dependencies when known.
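
A provenance record with these fields might look like the following sketch. ProvenanceSketch and provenance_for are hypothetical; the real ProcessingProvenance type may use different field names, and the "sha256:" prefix on the content hash is an assumption.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class ProvenanceSketch:
    """Hypothetical record; fields left as None are simply not known."""

    operation: str
    source_path: Optional[str] = None
    content_hash: Optional[str] = None
    snapshot_id: Optional[str] = None
    provider_id: Optional[str] = None
    dependencies: Tuple[str, ...] = ()


def provenance_for(
    operation: str,
    content: bytes,
    source_path: Optional[str] = None,
    dependencies: Iterable[str] = (),
) -> ProvenanceSketch:
    """Build a record for `operation`, hashing the content it consumed."""
    return ProvenanceSketch(
        operation=operation,
        source_path=source_path,
        content_hash="sha256:" + hashlib.sha256(content).hexdigest(),
        dependencies=tuple(dependencies),
    )


record = provenance_for(
    "processor.include",
    b"# Title\n",
    source_path="docs/a.md",
    dependencies=["docs/partial.md"],
)
```

Leaving unknown fields as None keeps the record honest about what was known at the time instead of filling in placeholder values.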

Safety And Policy

Descriptors should declare safety-relevant behavior:

reads files
writes local cache
writes user output files
accesses network
invokes external process
calls assisted-generation provider
transmits content outside the local process

The initial framework only records this metadata; later policy layers can enforce it.
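
As an illustration of how a later policy layer could use these declarations, the sketch below models them as bit flags and reports any declared behavior a policy does not allow. The Safety enum and the violations helper are hypothetical; the guide does not say how descriptors actually encode safety metadata.

```python
from enum import Flag, auto


class Safety(Flag):
    """Hypothetical flags mirroring the behaviors listed above."""

    NONE = 0
    READS_FILES = auto()
    WRITES_LOCAL_CACHE = auto()
    WRITES_USER_OUTPUT = auto()
    ACCESSES_NETWORK = auto()
    INVOKES_EXTERNAL_PROCESS = auto()
    CALLS_GENERATION_PROVIDER = auto()
    TRANSMITS_CONTENT = auto()


def violations(declared: Safety, allowed: Safety) -> Safety:
    """Return the declared behaviors that the policy does not allow."""
    return declared & ~allowed


# An offline policy that permits only local reads and cache writes.
offline_policy = Safety.READS_FILES | Safety.WRITES_LOCAL_CACHE

indexer = Safety.READS_FILES | Safety.WRITES_LOCAL_CACHE
remote_generator = (
    Safety.READS_FILES
    | Safety.ACCESSES_NETWORK
    | Safety.CALLS_GENERATION_PROVIDER
    | Safety.TRANSMITS_CONTENT
)

indexer_violations = violations(indexer, offline_policy)
generator_violations = violations(remote_generator, offline_policy)
```

indexer_violations is empty, so the indexer would pass the offline policy. generator_violations contains the network, provider, and transmission flags, which a policy layer could turn into diagnostics.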

CLI Affordances

If an extension exposes CLI behavior, declare it in descriptor.cli:

cli={"commands": ["mkt cache index", "mkt search"]}

markitect_tool.cli.extensions.collect_cli_command_specs() can inspect these affordances without importing Click command implementations.
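
The idea of inspecting CLI affordances as plain data can be sketched as follows. This is not the implementation of collect_cli_command_specs(): collect_commands is a hypothetical helper, descriptors are modeled as plain dicts, and the ids other than query.example are invented for the example.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def collect_commands(descriptors: Iterable[Mapping[str, Any]]) -> List[Tuple[str, str]]:
    """Return sorted (command, extension id) pairs from descriptor-like dicts."""
    pairs = []
    for descriptor in descriptors:
        # Descriptors without a cli declaration contribute nothing.
        cli = descriptor.get("cli") or {}
        for command in cli.get("commands", []):
            pairs.append((command, descriptor["id"]))
    return sorted(pairs)


descriptors = [
    {"id": "query.example", "cli": {"commands": ["mkt query --engine example"]}},
    {"id": "backend.example", "cli": {"commands": ["mkt cache index", "mkt search"]}},
    {"id": "validator.example"},  # no CLI affordances declared
]

specs = collect_commands(descriptors)
```

Because the collector only reads declared strings, it never needs to import the modules that implement the commands.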

Testing Checklist

Add tests for:

descriptor serialization
registry lookup and duplicate handling
missing optional dependency diagnostics
canonical result validity
provenance shape
CLI output envelope if public commands are exposed
compatibility shim if replacing an existing API

When refactoring an existing feature, add characterization tests first, then migrate the implementation behind descriptors or registries.

Boundary With Workflows

Internal extensions describe what Markitect can do. Workflows describe how a user combines capabilities for a concrete document pipeline.

An extension may expose a workflow step later, but it should not depend on the workflow engine to be useful from the library or CLI.
