# Internal Extension Authoring

## Purpose

This guide describes how to add a new internal Markitect extension without
turning central modules into the main integration surface.

Use this for internal query engines, processors, backend/index stores,
reference providers, validators, template/generation adapters, CLI command
groups, render/export adapters, and future document functions.

Source-format adapters are external package extensions. Use
`docs/source-adapter-contract.md` for the source adapter protocol, entry point
group, descriptor shape, and contract-test expectations.

## Recommended Shape

Each extension should have:

- implementation module
- descriptor or descriptor factory
- focused tests
- characterization coverage if it changes existing behavior
- documentation or example link
- diagnostic namespace
- provenance operation prefix
- optional dependency declaration
- capability and safety declarations

Prefer this shape:

```text
src/markitect_tool/<area>/<feature>.py
tests/test_<area>_<feature>.py
docs/<feature>.md
```

If the extension is cross-cutting, register it from
`markitect_tool.extension.builtins` or a future internal discovery module rather
than importing it from many central files.

## Descriptor Template

```python
from markitect_tool.extension import ExtensionDescriptor, ProcessingCapability


def my_extension_descriptor() -> ExtensionDescriptor:
    return ExtensionDescriptor(
        id="query.example",
        kind="query-engine",
        summary="Example query engine.",
        capabilities=[
            ProcessingCapability(id="ast", kind="read"),
        ],
        input_contract="Document + example expression",
        output_contract="QueryMatch[]",
        diagnostics_namespace="query.example",
        provenance_prefix="query.example",
        cli={"commands": ["mkt query --engine example"]},
        docs=["docs/example-query.md"],
        examples=["examples/query/example.md"],
    )
```

## Optional Dependencies

Declare optional dependencies in descriptors:

```python
from markitect_tool.extension import OptionalDependency

OptionalDependency(
    name="jsonpath_ng",
    package="jsonpath-ng",
    extra="query",
    required=True,
    purpose="Evaluate JSONPath expressions.",
)
```

If a dependency is missing, return a structured diagnostic. Do not fail with an
unexplained import error.

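The missing-dependency rule can be sketched with the standard library alone. This is a minimal illustration, not Markitect's implementation: `MissingDependencyDiagnostic` and `check_optional_dependency` are hypothetical names introduced here, and only `importlib.util.find_spec` is a real API. The probe checks whether a top-level module is importable without importing it.

```python
from dataclasses import dataclass
from importlib.util import find_spec
from typing import Optional


@dataclass(frozen=True)
class MissingDependencyDiagnostic:
    """Hypothetical stand-in for Markitect's own diagnostic type."""

    code: str
    message: str


def check_optional_dependency(
    module: str, package: str, extra: str
) -> Optional[MissingDependencyDiagnostic]:
    """Return None if the top-level `module` is importable, else a diagnostic."""
    if find_spec(module) is not None:
        return None
    return MissingDependencyDiagnostic(
        code="extension.missing_dependency",
        message=(
            f"Module '{module}' is not installed. "
            f"Install '{package}' or the '{extra}' extra."
        ),
    )


# A module name that is assumed not to exist in any environment.
diagnostic = check_optional_dependency(
    "markitect_hypothetical_missing_module", "hypothetical-package", "query"
)
```

Running the check before the extension's first real import lets the caller receive a diagnostic instead of an `ImportError` traceback.
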
## Processing Envelopes

Use canonical processing envelopes where an extension needs a shared execution
boundary:

- `ProcessingRequest`
- `ProcessingContext`
- `ProcessingResult`
- `ProcessingCapability`
- `ProcessingProvenance`
- `ProcessingTrace`

Subsystem-specific dataclasses may remain richer. The canonical model is the
bridge that lets callbacks, registries, diagnostics, provenance, and future
policy checks interact consistently.

### Minimal Runnable Extension

```python
from markitect_tool.extension import (
    ExtensionDescriptor,
    ExtensionExecutor,
    ExtensionRegistry,
    ProcessingRequest,
    ProcessingResult,
)


def run_example(request: ProcessingRequest) -> ProcessingResult:
    name = request.input.get("name", "world")
    return ProcessingResult(output=f"Hello, {name}")


descriptor = ExtensionDescriptor(
    id="example.hello",
    kind="example",
    summary="Small example extension.",
    factory=lambda: run_example,
)

registry = ExtensionRegistry([descriptor])
result = ExtensionExecutor(registry).execute(
    "example.hello",
    ProcessingRequest(operation="example.hello", input={"name": "Markitect"}),
)
```

Use this executor boundary when callbacks, dependency checks, trace events, or
future policy checks matter. For tiny deterministic helpers, it is still fine to
keep the existing direct function API and expose a descriptor alongside it.

### Cache-Key Rules

`ProcessingRequest.cache_key` includes:

- operation
- input
- stable context material
- options
- scope
- declared capabilities
- request metadata

Stable context material includes source path, namespaces, variables, policy, and
metadata. It does not include workspace root, caller, or live backend handles.
This keeps cache keys portable while avoiding collisions for context-sensitive
operations.

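One way to picture these rules is a hash over canonical JSON that keeps only the stable context fields. The sketch below is an assumption about how such a key could be derived, not the actual `ProcessingRequest.cache_key` algorithm; `sketch_cache_key`, `STABLE_CONTEXT_FIELDS`, and the dictionary field names are hypothetical.

```python
import hashlib
import json
from typing import Any, Mapping, Optional, Sequence

# Context fields treated as stable; the names are illustrative only.
STABLE_CONTEXT_FIELDS = ("source_path", "namespaces", "variables", "policy", "metadata")


def sketch_cache_key(
    operation: str,
    payload: Any,
    context: Mapping[str, Any],
    options: Optional[Mapping[str, Any]] = None,
    scope: Optional[str] = None,
    capabilities: Sequence[str] = (),
    metadata: Optional[Mapping[str, Any]] = None,
) -> str:
    """Hash the request material, ignoring unstable context such as workspace root."""
    stable_context = {k: context[k] for k in STABLE_CONTEXT_FIELDS if k in context}
    material = {
        "operation": operation,
        "input": payload,
        "context": stable_context,
        "options": dict(options or {}),
        "scope": scope,
        "capabilities": sorted(capabilities),
        "metadata": dict(metadata or {}),
    }
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


key_a = sketch_cache_key(
    "query.example", {"expr": "$.a"}, {"source_path": "a.md", "workspace_root": "/tmp/x"}
)
key_b = sketch_cache_key(
    "query.example", {"expr": "$.a"}, {"source_path": "a.md", "workspace_root": "/home/y"}
)
key_c = sketch_cache_key(
    "query.example", {"expr": "$.a"}, {"source_path": "b.md", "workspace_root": "/tmp/x"}
)
```

Here `key_a` and `key_b` match because only the workspace root differs, while `key_c` differs because the source path is part of the stable context.
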
## Diagnostics

Diagnostics should be:

- stable enough for tests and callers
- namespaced by subsystem or extension
- explicit about optional dependency failures
- tied to source locations where possible
- emitted as `Diagnostic` or `ProcessingResult.from_error`

Recommended code style:

```text
<extension-kind>.<condition>
query.invalid_jsonpath
processor.unknown
extension.missing_dependency
backend.local_sqlite.invalid_fts_query
```

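The naming convention above can be checked mechanically. The following sketch assumes codes are lowercase dotted identifiers with at least two segments; `SketchDiagnostic` and `CODE_PATTERN` are hypothetical and do not describe Markitect's real `Diagnostic` class.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Lowercase dotted segments, at least two, e.g. "query.invalid_jsonpath".
CODE_PATTERN = re.compile(r"^[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+$")


@dataclass(frozen=True)
class SketchDiagnostic:
    """Hypothetical diagnostic carrying a namespaced code and a source location."""

    code: str
    message: str
    path: Optional[str] = None
    line: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject codes that do not follow the <namespace>.<condition> style.
        if not CODE_PATTERN.match(self.code):
            raise ValueError(f"diagnostic code is not namespaced: {self.code!r}")


example = SketchDiagnostic(
    code="query.invalid_jsonpath",
    message="Unbalanced bracket in expression.",
    path="docs/example-query.md",
    line=12,
)
```

Validating the code at construction time keeps malformed codes out of test snapshots and caller-side matching logic.
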
## Provenance

Every extension that transforms, queries, reads, writes, generates, or indexes
content should expose provenance. Use a stable operation prefix:

```text
query.selector
query.jsonpath
processor.include
local_snapshot_store.put_file
```

Include source path, content hash, snapshot id, backend/provider id, and
dependencies when known.

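A provenance record with the fields listed above might look like the sketch below. `SketchProvenance` and `provenance_for` are hypothetical names, the `sha256:` prefix is an assumed convention, and the real `ProcessingProvenance` type may differ; unknown fields are simply left as `None`.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class SketchProvenance:
    """Hypothetical record; fields mirror the list in the text above."""

    operation: str
    source_path: Optional[str] = None
    content_hash: Optional[str] = None
    snapshot_id: Optional[str] = None
    provider_id: Optional[str] = None
    dependencies: Tuple[str, ...] = ()


def provenance_for(
    operation: str,
    source_path: str,
    content: bytes,
    dependencies: Sequence[str] = (),
) -> SketchProvenance:
    """Build a record for content that was just read from `source_path`."""
    digest = hashlib.sha256(content).hexdigest()
    return SketchProvenance(
        operation=operation,
        source_path=source_path,
        content_hash=f"sha256:{digest}",
        dependencies=tuple(dependencies),
    )


record = provenance_for(
    "processor.include", "docs/example.md", b"# Title\n", ["docs/partial.md"]
)
```
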
## Safety And Policy

Descriptors should declare safety-relevant behavior:

- reads files
- writes local cache
- writes user output files
- accesses network
- invokes external process
- calls assisted-generation provider
- transmits content outside the local process

The initial framework records this metadata. Later policy layers can enforce it.

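To illustrate how recorded declarations could later feed a policy layer, the sketch below models the behaviors as an `enum.Flag` and computes which declared behaviors a policy does not allow. `SafetyFlag` and `violations` are hypothetical; the source does not specify how descriptors encode these declarations.

```python
from enum import Flag, auto


class SafetyFlag(Flag):
    """Hypothetical encoding of the safety-relevant behaviors listed above."""

    NONE = 0
    READS_FILES = auto()
    WRITES_LOCAL_CACHE = auto()
    WRITES_USER_OUTPUT = auto()
    ACCESSES_NETWORK = auto()
    INVOKES_EXTERNAL_PROCESS = auto()
    CALLS_GENERATION_PROVIDER = auto()
    TRANSMITS_CONTENT = auto()


def violations(declared: SafetyFlag, allowed: SafetyFlag) -> SafetyFlag:
    """Return the declared behaviors that the policy does not allow."""
    return declared & ~allowed


declared = SafetyFlag.READS_FILES | SafetyFlag.ACCESSES_NETWORK
offline_policy = SafetyFlag.READS_FILES | SafetyFlag.WRITES_LOCAL_CACHE
blocked = violations(declared, offline_policy)
```

In this example `blocked` contains only the network access flag, which is the information a policy layer would need to refuse or warn.
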
## CLI Affordances

If an extension exposes CLI behavior, declare it in `descriptor.cli`:

```python
cli={"commands": ["mkt cache index", "mkt search"]}
```

`markitect_tool.cli.extensions.collect_cli_command_specs()` can inspect these
affordances without importing Click command implementations.

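The idea of inspecting CLI affordances from descriptor metadata alone can be sketched as follows. This is not the implementation of `collect_cli_command_specs()`, whose signature and return type are not shown in this guide; `sketch_collect_commands` is a hypothetical helper that treats descriptors as plain mappings.

```python
from typing import Any, Iterable, List, Mapping


def sketch_collect_commands(descriptors: Iterable[Mapping[str, Any]]) -> List[str]:
    """Gather declared command strings in order, skipping duplicates."""
    commands: List[str] = []
    for descriptor in descriptors:
        cli = descriptor.get("cli") or {}
        for command in cli.get("commands", []):
            if command not in commands:
                commands.append(command)
    return commands


declared_commands = sketch_collect_commands(
    [
        {"id": "backend.example", "cli": {"commands": ["mkt cache index", "mkt search"]}},
        {"id": "query.example", "cli": {"commands": ["mkt query --engine example"]}},
        {"id": "example.hello"},  # no CLI affordance declared
    ]
)
```

Because only metadata is read, no command module has to be imported to list what the CLI could expose.
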
## Testing Checklist

Add tests for:

- descriptor serialization
- registry lookup and duplicate handling
- missing optional dependency diagnostics
- canonical result validity
- provenance shape
- CLI output envelope if public commands are exposed
- compatibility shim if replacing an existing API

When refactoring an existing feature, add characterization tests first, then
migrate implementation behind descriptors or registries.

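A few of the checklist items can be sketched as pytest-style test functions. The example runs against `SketchRegistry`, a hypothetical in-memory stand-in, so it stays self-contained; the real `ExtensionRegistry` may handle duplicates differently (rejecting them with `ValueError` is an assumption made here).

```python
import json
from typing import Dict


class SketchRegistry:
    """Hypothetical registry used only to make the tests below runnable."""

    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}

    def register(self, descriptor: dict) -> None:
        if descriptor["id"] in self._items:
            raise ValueError(f"duplicate extension id: {descriptor['id']}")
        self._items[descriptor["id"]] = descriptor

    def get(self, extension_id: str) -> dict:
        return self._items[extension_id]


DESCRIPTOR = {"id": "example.hello", "kind": "example", "summary": "Small example."}


def test_descriptor_round_trips_through_json() -> None:
    assert json.loads(json.dumps(DESCRIPTOR)) == DESCRIPTOR


def test_registry_lookup() -> None:
    registry = SketchRegistry()
    registry.register(DESCRIPTOR)
    assert registry.get("example.hello")["kind"] == "example"


def test_registry_rejects_duplicates() -> None:
    registry = SketchRegistry()
    registry.register(DESCRIPTOR)
    try:
        registry.register(DESCRIPTOR)
    except ValueError:
        return
    raise AssertionError("expected duplicate registration to fail")
```
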
## Boundary With Workflows

Internal extensions describe what Markitect can do. Workflows describe how a
user combines capabilities for a concrete document pipeline.

An extension may expose a workflow step later, but it should not depend on the
workflow engine to be useful from the library or CLI.