# Internal Extension Authoring
## Purpose
This guide describes how to add a new internal Markitect extension without
turning central modules into the main integration surface.

Use this for internal query engines, processors, backend/index stores,
reference providers, validators, template/generation adapters, CLI command
groups, render/export adapters, and future document functions.
## Recommended Shape
Each extension should have:
- implementation module
- descriptor or descriptor factory
- focused tests
- characterization coverage if it changes existing behavior
- documentation or example link
- diagnostic namespace
- provenance operation prefix
- optional dependency declaration
- capability and safety declarations

Prefer this shape:
```text
src/markitect_tool/<area>/<feature>.py
tests/test_<area>_<feature>.py
docs/<feature>.md
```
If the extension is cross-cutting, register it from
`markitect_tool.extension.builtins` or a future internal discovery module rather
than importing it from many central files.
## Descriptor Template
```python
from markitect_tool.extension import ExtensionDescriptor, ProcessingCapability


def my_extension_descriptor() -> ExtensionDescriptor:
    return ExtensionDescriptor(
        id="query.example",
        kind="query-engine",
        summary="Example query engine.",
        capabilities=[
            ProcessingCapability(id="ast", kind="read"),
        ],
        input_contract="Document + example expression",
        output_contract="QueryMatch[]",
        diagnostics_namespace="query.example",
        provenance_prefix="query.example",
        cli={"commands": ["mkt query --engine example"]},
        docs=["docs/example-query.md"],
        examples=["examples/query/example.md"],
    )
```
## Optional Dependencies
Declare optional dependencies in descriptors:
```python
from markitect_tool.extension import OptionalDependency

OptionalDependency(
    name="jsonpath_ng",
    package="jsonpath-ng",
    extra="query",
    required=True,
    purpose="Evaluate JSONPath expressions.",
)
```
If a dependency is missing, return a structured diagnostic. Do not fail with an
unexplained import error.
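
One way to honor this rule is to probe for the module before importing it and to turn a miss into data rather than an exception. The sketch below is illustrative only: `MissingDependencyDiagnostic` and `check_optional_dependency` are hypothetical stand-ins, not Markitect types, and the fields are an assumption about what a structured diagnostic might carry.

```python
import importlib.util
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MissingDependencyDiagnostic:
    """Hypothetical stand-in for a structured diagnostic (not the real Markitect type)."""

    code: str
    message: str
    package: str
    extra: str


def check_optional_dependency(
    name: str, package: str, extra: str
) -> Optional[MissingDependencyDiagnostic]:
    """Return None if the top-level module `name` is importable, else a diagnostic."""
    # find_spec returns None for a missing top-level module instead of raising.
    if importlib.util.find_spec(name) is not None:
        return None
    return MissingDependencyDiagnostic(
        code="extension.missing_dependency",
        message=f"Optional module '{name}' is not installed; install '{package}'.",
        package=package,
        extra=extra,
    )


# Usage: check before the real import, and surface the diagnostic if present.
diagnostic = check_optional_dependency("jsonpath_ng", "jsonpath-ng", "query")
if diagnostic is not None:
    print(f"{diagnostic.code}: {diagnostic.message}")
```

The caller decides how to surface the diagnostic; the point is that the failure is named and explained before any `import` statement can raise.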
## Processing Envelopes
Use canonical processing envelopes where an extension needs a shared execution
boundary:
- `ProcessingRequest`
- `ProcessingContext`
- `ProcessingResult`
- `ProcessingCapability`
- `ProcessingProvenance`
- `ProcessingTrace`

Subsystem-specific dataclasses may remain richer. The canonical model is the
bridge that lets callbacks, registries, diagnostics, provenance, and future
policy checks interact consistently.
## Diagnostics
Diagnostics should be:
- stable enough for tests and callers
- namespaced by subsystem or extension
- explicit about optional dependency failures
- tied to source locations where possible
- emitted as `Diagnostic` or `ProcessingResult.from_error`

Recommended code style:
```text
<extension-kind>.<condition>
query.invalid_jsonpath
processor.unknown
extension.missing_dependency
backend.local_sqlite.invalid_fts_query
```
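
A small test helper can keep codes within this style. The sketch below assumes a convention inferred from the examples above (two or more dot-separated segments of lowercase letters, digits, and underscores); the guide does not state that rule outright, and `is_namespaced_code` is a hypothetical name.

```python
import re

# Assumed convention, inferred from the examples: at least two dot-separated
# segments, each starting with a lowercase letter and containing only
# lowercase letters, digits, and underscores.
_CODE_PATTERN = re.compile(r"[a-z][a-z0-9_]*(?:\.[a-z][a-z0-9_]*)+")


def is_namespaced_code(code: str) -> bool:
    """Return True if `code` looks like a namespaced diagnostic code."""
    return _CODE_PATTERN.fullmatch(code) is not None


# Usage: guard the codes an extension emits in its focused tests.
for code in ("query.invalid_jsonpath", "backend.local_sqlite.invalid_fts_query"):
    assert is_namespaced_code(code)
```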
## Provenance
Every extension that transforms, queries, reads, writes, generates, or indexes
content should expose provenance. Use a stable operation prefix:
```text
query.selector
query.jsonpath
processor.include
local_snapshot_store.put_file
```
Include source path, content hash, snapshot id, backend/provider id, and
dependencies when known.
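
As a rough illustration of the fields listed above, a provenance record might look like the following. `ProvenanceRecord` and `build_provenance` are hypothetical names for this sketch and are not the `ProcessingProvenance` type; the `sha256:` hash format is also an assumption.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record; every field except `operation` is optional ("when known")."""

    operation: str
    source_path: Optional[str] = None
    content_hash: Optional[str] = None
    snapshot_id: Optional[str] = None
    backend_id: Optional[str] = None
    dependencies: Tuple[str, ...] = ()


def build_provenance(
    operation: str,
    content: bytes,
    source_path: Optional[str] = None,
    snapshot_id: Optional[str] = None,
    backend_id: Optional[str] = None,
    dependencies: Tuple[str, ...] = (),
) -> ProvenanceRecord:
    """Hash the content and bundle it with whatever other facts are known."""
    digest = hashlib.sha256(content).hexdigest()
    return ProvenanceRecord(
        operation=operation,
        source_path=source_path,
        content_hash=f"sha256:{digest}",
        snapshot_id=snapshot_id,
        backend_id=backend_id,
        dependencies=dependencies,
    )


# Usage: the operation string starts with the extension's stable prefix.
record = build_provenance("query.jsonpath", b"# Title\n", source_path="docs/a.md")
```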
## Safety And Policy
Descriptors should declare safety-relevant behavior:
- reads files
- writes local cache
- writes user output files
- accesses network
- invokes external process
- calls assisted-generation provider
- transmits content outside the local process

The initial framework records this metadata. Later policy layers can enforce it.
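
To make the "record now, enforce later" split concrete, the declarations above could be modeled as flags that a future policy layer compares against an allow-list. This is only a sketch: `SafetyFlag` and `violations` are hypothetical names, and the guide does not specify how descriptors actually encode safety metadata.

```python
from enum import Flag, auto


class SafetyFlag(Flag):
    """Hypothetical encoding of the safety-relevant behaviors listed above."""

    READS_FILES = auto()
    WRITES_LOCAL_CACHE = auto()
    WRITES_USER_OUTPUT = auto()
    ACCESSES_NETWORK = auto()
    INVOKES_EXTERNAL_PROCESS = auto()
    CALLS_ASSISTED_GENERATION = auto()
    TRANSMITS_CONTENT = auto()


def violations(declared: SafetyFlag, allowed: SafetyFlag) -> SafetyFlag:
    """Return the declared behaviors that the policy does not allow."""
    return declared & ~allowed


# Usage: an extension that reads files and uses the network, checked
# against a policy that only permits local reads and cache writes.
declared = SafetyFlag.READS_FILES | SafetyFlag.ACCESSES_NETWORK
allowed = SafetyFlag.READS_FILES | SafetyFlag.WRITES_LOCAL_CACHE
blocked = violations(declared, allowed)
```

Here `blocked` contains only `SafetyFlag.ACCESSES_NETWORK`, which a policy layer could report as a diagnostic instead of running the extension.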
## CLI Affordances
If an extension exposes CLI behavior, declare it in `descriptor.cli`:
```python
cli={"commands": ["mkt cache index", "mkt search"]}
```
`markitect_tool.cli.extensions.collect_cli_command_specs()` can inspect these
affordances without importing Click command implementations.
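
The idea can be illustrated with plain data. The sketch below is not the implementation of `collect_cli_command_specs()`; `StubDescriptor` and `collect_commands` are hypothetical stand-ins that show how command strings can be gathered from descriptor metadata alone, without touching any Click code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StubDescriptor:
    """Hypothetical stand-in carrying only the fields this sketch needs."""

    id: str
    cli: Dict[str, List[str]] = field(default_factory=dict)


def collect_commands(descriptors: List[StubDescriptor]) -> List[Dict[str, str]]:
    """Flatten declared CLI commands into (extension, command) records."""
    specs = []
    for descriptor in descriptors:
        # Descriptors without CLI behavior simply contribute nothing.
        for command in descriptor.cli.get("commands", []):
            specs.append({"extension": descriptor.id, "command": command})
    return specs


# Usage: only metadata is read; no command implementation is imported.
specs = collect_commands(
    [
        StubDescriptor(id="backend.local_sqlite", cli={"commands": ["mkt cache index", "mkt search"]}),
        StubDescriptor(id="query.example"),
    ]
)
```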
## Testing Checklist
Add tests for:
- descriptor serialization
- registry lookup and duplicate handling
- missing optional dependency diagnostics
- canonical result validity
- provenance shape
- CLI output envelope if public commands are exposed
- compatibility shim if replacing an existing API

When refactoring an existing feature, add characterization tests first, then
migrate implementation behind descriptors or registries.
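
As an example of the "registry lookup and duplicate handling" item, a focused test might look like the sketch below. `StubRegistry` and `DuplicateExtensionError` are hypothetical and exist only so the test is runnable here; whether the real registry raises on duplicates or reports a diagnostic is an assumption to replace with the actual behavior.

```python
class DuplicateExtensionError(ValueError):
    """Hypothetical error type for this sketch."""


class StubRegistry:
    """Minimal stand-in registry keyed by extension id."""

    def __init__(self):
        self._by_id = {}

    def register(self, ext_id, descriptor):
        # Assumed behavior: a second registration under the same id is rejected.
        if ext_id in self._by_id:
            raise DuplicateExtensionError(f"extension already registered: {ext_id}")
        self._by_id[ext_id] = descriptor

    def get(self, ext_id):
        return self._by_id.get(ext_id)


def test_lookup_and_duplicates():
    registry = StubRegistry()
    registry.register("query.example", {"kind": "query-engine"})

    # Lookup returns what was registered, and None for unknown ids.
    assert registry.get("query.example") == {"kind": "query-engine"}
    assert registry.get("query.missing") is None

    # Registering the same id twice must fail loudly.
    try:
        registry.register("query.example", {"kind": "query-engine"})
    except DuplicateExtensionError:
        pass
    else:
        raise AssertionError("duplicate registration should fail")
```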
## Boundary With Workflows
Internal extensions describe what Markitect can do. Workflows describe how a
user combines capabilities for a concrete document pipeline.
An extension may expose a workflow step later, but it should not depend on the
workflow engine to be useful from the library or CLI.