generated from coulomb/repo-seed
Extensible canonical internal processing refactoring
178
docs/extension-authoring.md
Normal file
@@ -0,0 +1,178 @@
# Internal Extension Authoring

## Purpose

This guide describes how to add a new internal Markitect extension without
turning central modules into the main integration surface.

Use this for internal query engines, processors, backend/index stores,
reference providers, validators, template/generation adapters, CLI command
groups, render/export adapters, and future document functions.

## Recommended Shape

Each extension should have:

- implementation module
- descriptor or descriptor factory
- focused tests
- characterization coverage if it changes existing behavior
- documentation or example link
- diagnostic namespace
- provenance operation prefix
- optional dependency declaration
- capability and safety declarations

Prefer this shape:

```text
src/markitect_tool/<area>/<feature>.py
tests/test_<area>_<feature>.py
docs/<feature>.md
```

If the extension is cross-cutting, register it from
`markitect_tool.extension.builtins` or a future internal discovery module rather
than importing it from many central files.

## Descriptor Template

```python
from markitect_tool.extension import ExtensionDescriptor, ProcessingCapability


def my_extension_descriptor() -> ExtensionDescriptor:
    return ExtensionDescriptor(
        id="query.example",
        kind="query-engine",
        summary="Example query engine.",
        capabilities=[
            ProcessingCapability(id="ast", kind="read"),
        ],
        input_contract="Document + example expression",
        output_contract="QueryMatch[]",
        diagnostics_namespace="query.example",
        provenance_prefix="query.example",
        cli={"commands": ["mkt query --engine example"]},
        docs=["docs/example-query.md"],
        examples=["examples/query/example.md"],
    )
```

## Optional Dependencies

Declare optional dependencies in descriptors:

```python
from markitect_tool.extension import OptionalDependency

OptionalDependency(
    name="jsonpath_ng",
    package="jsonpath-ng",
    extra="query",
    required=True,
    purpose="Evaluate JSONPath expressions.",
)
```

If a dependency is missing, return a structured diagnostic. Do not fail with an
unexplained import error.
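
The missing-dependency rule can be illustrated with a minimal sketch. It does not use Markitect's real types: `MissingDependencyDiagnostic` and `check_optional_dependency` are hypothetical stand-ins, and only the `extension.missing_dependency` code is taken from this guide. The idea is to probe for the module with `importlib.util.find_spec` before importing it, so a missing package turns into data the caller can report.

```python
from dataclasses import dataclass
from importlib.util import find_spec
from typing import Optional


@dataclass(frozen=True)
class MissingDependencyDiagnostic:
    """Hypothetical stand-in for a structured diagnostic (not Markitect's type)."""

    code: str
    message: str
    hint: str


def check_optional_dependency(
    name: str, package: str, extra: str
) -> Optional[MissingDependencyDiagnostic]:
    """Return None if top-level module `name` is importable, else a diagnostic."""
    # find_spec returns None for a missing top-level module instead of raising.
    if find_spec(name) is not None:
        return None
    return MissingDependencyDiagnostic(
        code="extension.missing_dependency",
        message=f"Optional module '{name}' is not installed.",
        hint=f"Install '{package}' (for example through the '{extra}' extra).",
    )


diagnostic = check_optional_dependency("jsonpath_ng", "jsonpath-ng", "query")
if diagnostic is not None:
    print(f"{diagnostic.code}: {diagnostic.message} {diagnostic.hint}")
```

An extension would run a check like this at the start of its entry point and return the diagnostic in its result rather than letting `ImportError` propagate.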

## Processing Envelopes

Use canonical processing envelopes where an extension needs a shared execution
boundary:

- `ProcessingRequest`
- `ProcessingContext`
- `ProcessingResult`
- `ProcessingCapability`
- `ProcessingProvenance`
- `ProcessingTrace`

Subsystem-specific dataclasses may remain richer. The canonical model is the
bridge that lets callbacks, registries, diagnostics, provenance, and future
policy checks interact consistently.

## Diagnostics

Diagnostics should be:

- stable enough for tests and callers
- namespaced by subsystem or extension
- explicit about optional dependency failures
- tied to source locations where possible
- emitted as `Diagnostic` or `ProcessingResult.from_error`

Recommended style for diagnostic codes:

```text
<extension-kind>.<condition>
query.invalid_jsonpath
processor.unknown
extension.missing_dependency
backend.local_sqlite.invalid_fts_query
```
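
A small check can help tests keep diagnostic codes consistent with the style above. The sketch below assumes a convention inferred from the examples (two or more dot-separated segments, each made of lowercase letters, digits, and underscores); the pattern and the helper name `is_valid_diagnostic_code` are illustrative and not part of Markitect.

```python
import re

# Assumed convention, inferred from the examples in this guide:
# at least two dot-separated segments of lowercase letters, digits, underscores.
DIAGNOSTIC_CODE = re.compile(r"[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+")


def is_valid_diagnostic_code(code: str) -> bool:
    """Return True if `code` follows the assumed namespaced style."""
    return DIAGNOSTIC_CODE.fullmatch(code) is not None


for code in ("query.invalid_jsonpath", "backend.local_sqlite.invalid_fts_query"):
    assert is_valid_diagnostic_code(code)
```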

## Provenance

Every extension that transforms, queries, reads, writes, generates, or indexes
content should expose provenance. Use a stable operation prefix:

```text
query.selector
query.jsonpath
processor.include
local_snapshot_store.put_file
```

Include source path, content hash, snapshot id, backend/provider id, and
dependencies when known.
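
As a rough illustration of the fields listed above, the sketch below builds a provenance record with a SHA-256 content hash. `ProvenanceRecord` and `record_for` are hypothetical names, the `sha256:` prefix is an assumed convention, and the real `ProcessingProvenance` type may be shaped differently. Fields that are not known stay `None`.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record shape; not Markitect's ProcessingProvenance."""

    operation: str
    source_path: Optional[str] = None
    content_hash: Optional[str] = None
    snapshot_id: Optional[str] = None
    provider_id: Optional[str] = None
    dependencies: Tuple[str, ...] = ()


def record_for(operation: str, content: bytes, **known: object) -> ProvenanceRecord:
    """Hash `content` and attach whichever optional fields the caller knows."""
    digest = "sha256:" + hashlib.sha256(content).hexdigest()
    return ProvenanceRecord(operation=operation, content_hash=digest, **known)


record = record_for("query.jsonpath", b"# Title\n", source_path="docs/a.md")
print(record.operation, record.source_path, record.content_hash)
```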

## Safety And Policy

Descriptors should declare safety-relevant behavior:

- reads files
- writes local cache
- writes user output files
- accesses network
- invokes external process
- calls assisted-generation provider
- transmits content outside the local process

The initial framework records this metadata. Later policy layers can enforce it.
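
One possible way to model these declarations, shown here only as a sketch, is a flag enum whose members mirror the list above, plus a helper that a later policy layer could use to find behaviors outside an allowed set. `SafetyBehavior` and `disallowed` are hypothetical names; the guide does not specify how descriptors encode this metadata.

```python
from enum import Flag, auto


class SafetyBehavior(Flag):
    """Hypothetical flags mirroring the safety-relevant behaviors listed above."""

    READS_FILES = auto()
    WRITES_LOCAL_CACHE = auto()
    WRITES_USER_OUTPUT = auto()
    ACCESSES_NETWORK = auto()
    INVOKES_EXTERNAL_PROCESS = auto()
    CALLS_GENERATION_PROVIDER = auto()
    TRANSMITS_CONTENT = auto()


def disallowed(declared: SafetyBehavior, allowed: SafetyBehavior) -> SafetyBehavior:
    """Return the declared behaviors that the policy does not allow."""
    return declared & ~allowed


declared = SafetyBehavior.READS_FILES | SafetyBehavior.ACCESSES_NETWORK
policy = SafetyBehavior.READS_FILES | SafetyBehavior.WRITES_LOCAL_CACHE
violations = disallowed(declared, policy)
if violations:
    print("blocked by policy:", violations)
```

Recording flags first and enforcing later matches the staged approach described above: the descriptor stays declarative, and enforcement can be added without touching extension code.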

## CLI Affordances

If an extension exposes CLI behavior, declare it in `descriptor.cli`:

```python
cli={"commands": ["mkt cache index", "mkt search"]}
```

`markitect_tool.cli.extensions.collect_cli_command_specs()` can inspect these
affordances without importing Click command implementations.
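
To show why declarative `cli` metadata makes this kind of inspection cheap, here is a toy collector over plain dictionaries. It is not the real `collect_cli_command_specs()`, whose signature and return type are not shown in this guide; `collect_commands` and the dictionary shape are assumptions for illustration.

```python
from typing import Any, Dict, Iterable, List, Mapping


def collect_commands(descriptors: Iterable[Mapping[str, Any]]) -> List[str]:
    """Gather declared command strings, de-duplicated in first-seen order.

    Only metadata is read; no command implementation is imported.
    """
    seen: Dict[str, None] = {}
    for descriptor in descriptors:
        for command in descriptor.get("cli", {}).get("commands", []):
            seen.setdefault(command, None)
    return list(seen)


descriptors = [
    {"id": "backend.example", "cli": {"commands": ["mkt cache index", "mkt search"]}},
    {"id": "query.example", "cli": {"commands": ["mkt query --engine example"]}},
    {"id": "processor.example"},  # no CLI affordances declared
]
print(collect_commands(descriptors))
```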

## Testing Checklist

Add tests for:

- descriptor serialization
- registry lookup and duplicate handling
- missing optional dependency diagnostics
- canonical result validity
- provenance shape
- CLI output envelope if public commands are exposed
- compatibility shim if replacing an existing API

When refactoring an existing feature, add characterization tests first, then
migrate implementation behind descriptors or registries.
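
The "registry lookup and duplicate handling" item could be tested along these lines. `ToyRegistry` and `DuplicateExtensionError` are stand-ins written for this sketch, and rejecting duplicates with an exception is an assumption; the real registry may replace, warn, or return a diagnostic instead, in which case the assertions would change accordingly.

```python
class DuplicateExtensionError(ValueError):
    """Raised by the toy registry when an id is registered twice."""


class ToyRegistry:
    """Minimal stand-in registry keyed by extension id."""

    def __init__(self) -> None:
        self._by_id = {}

    def register(self, extension_id: str, descriptor: dict) -> None:
        if extension_id in self._by_id:
            raise DuplicateExtensionError(f"duplicate extension id: {extension_id}")
        self._by_id[extension_id] = descriptor

    def get(self, extension_id: str) -> dict:
        return self._by_id[extension_id]


def test_lookup_returns_registered_descriptor() -> None:
    registry = ToyRegistry()
    registry.register("query.example", {"kind": "query-engine"})
    assert registry.get("query.example") == {"kind": "query-engine"}


def test_duplicate_registration_is_rejected() -> None:
    registry = ToyRegistry()
    registry.register("query.example", {"kind": "query-engine"})
    try:
        registry.register("query.example", {"kind": "processor"})
    except DuplicateExtensionError:
        pass
    else:
        raise AssertionError("expected DuplicateExtensionError")
    # The first registration must survive the rejected duplicate.
    assert registry.get("query.example") == {"kind": "query-engine"}
```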

## Boundary With Workflows

Internal extensions describe what Markitect can do. Workflows describe how a
user combines capabilities for a concrete document pipeline.

An extension may expose a workflow step later, but it should not depend on the
workflow engine to be useful from the library or CLI.