generated from coulomb/repo-seed
Added deterministic function layer
137
docs/document-functions.md
Normal file
@@ -0,0 +1,137 @@
# Document Function Layer

Date: 2026-05-04

## Purpose

Document functions are a Markdown-native authoring surface over existing
Markitect primitives. They let a document author write small deterministic
operations inline or in fenced blocks while preserving diagnostics,
provenance, trace events, capability metadata, and extension descriptors.

The first implementation is intentionally conservative:

- deterministic functions only
- no network access
- no filesystem access
- no external processes
- no provider or assisted-generation calls
- no live flex-auth or external authorization service required

Riskier functions can be added later as optional adapters once capability and
policy gates are explicit.

## Syntax

Inline calls use:

```markdown
{{mkt:text.upper "draft"}}
```

Fenced block calls use:

````markdown
```mkt-function md.codeblock lang=python
print("hello")
```
````

Names are namespace-qualified. Arguments may be positional or named:

```markdown
{{mkt:md.heading text="Decision" level=2}}
```
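
As a rough illustration of positional versus named arguments, the sketch below
splits one call expression into a function id, positional values, and
`name=value` pairs. It assumes shell-style tokenization via `shlex.split`; the
helper name `split_call` is hypothetical and is not the parser shipped with
this layer.

```python
from __future__ import annotations

import shlex


def split_call(expression: str) -> tuple[str, list[str], dict[str, str]]:
    """Split one call expression into (function_id, args, kwargs).

    Hypothetical helper for illustration; all values stay plain strings.
    """
    tokens = shlex.split(expression)  # quotes are honored and then removed
    if not tokens:
        raise ValueError("empty call expression")
    function_id, rest = tokens[0], tokens[1:]
    args: list[str] = []
    kwargs: dict[str, str] = {}
    for token in rest:
        name, sep, value = token.partition("=")
        # Assumption: `name=value` is named only when `name` is an identifier.
        if sep and name.isidentifier():
            kwargs[name] = value
        else:
            args.append(token)
    return function_id, args, kwargs


print(split_call('md.heading text="Decision" level=2'))
print(split_call('text.upper "draft"'))
```

One limitation of this sketch: a quoted positional value that itself looks like
`name=value` would be read as a named argument.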

Pipeline calls pass the previous result as the first argument of the next
function:

```markdown
{{mkt:text.upper "draft" | text.replace DRAFT Final}}
```
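
A minimal sketch of that threading, using a toy function table. `FUNCTIONS`
and `run_pipeline` are illustrative names rather than Markitect API; the only
behavior taken from this layer is that each later stage receives the previous
output as its first positional argument.

```python
from __future__ import annotations

from typing import Any, Callable

# Toy stand-ins for registered functions; the ids mirror two built-ins, but
# the bodies are illustrative only.
FUNCTIONS: dict[str, Callable[..., Any]] = {
    "text.upper": lambda value: str(value).upper(),
    "text.replace": lambda value, old, new: str(value).replace(old, new),
}


def run_pipeline(stages: list[tuple[str, list[Any]]]) -> Any:
    """Evaluate stages left to right, threading each result forward."""
    output: Any = None
    for index, (function_id, args) in enumerate(stages):
        # Every stage after the first gets the previous output prepended.
        call_args = list(args) if index == 0 else [output, *args]
        output = FUNCTIONS[function_id](*call_args)
    return output


result = run_pipeline(
    [
        ("text.upper", ["draft"]),
        ("text.replace", ["DRAFT", "Final"]),
    ]
)
print(result)  # Final
```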

Values of the form `${name}` are resolved from `ProcessingContext.variables`.
This keeps data binding aligned with workflow expression conventions without
creating a second workflow engine.
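
The binding rule can be sketched as follows. This is an assumption-laden
illustration, not the shipped resolver: it treats only a value that is exactly
`${name}` as a binding, uses a plain `dict` in place of
`ProcessingContext.variables`, and raises on unknown names. The names
`VARIABLE_RE` and `resolve_value` are hypothetical.

```python
from __future__ import annotations

import re
from typing import Any

# Assumption: the whole value must be `${name}`; partial interpolation inside
# a longer string is out of scope for this sketch.
VARIABLE_RE = re.compile(r"^\$\{(?P<name>[A-Za-z_][A-Za-z0-9_.-]*)\}$")


def resolve_value(value: Any, variables: dict[str, Any]) -> Any:
    """Return the bound variable for `${name}`, else the value unchanged."""
    if not isinstance(value, str):
        return value
    match = VARIABLE_RE.match(value)
    if match is None:
        return value
    name = match.group("name")
    if name not in variables:
        # Assumption: fail loudly so the caller can report a diagnostic.
        raise KeyError(f"unknown variable: {name}")
    return variables[name]


variables = {"title": "Release Notes"}
print(resolve_value("${title}", variables))
print(resolve_value("plain text", variables))
```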

## Built-In Functions

Initial deterministic functions:

| Function | Purpose |
| --- | --- |
| `text.upper` | Uppercase text. |
| `text.lower` | Lowercase text. |
| `text.title` | Title-case text. |
| `text.trim` | Trim surrounding whitespace. |
| `text.replace` | Replace text. |
| `text.join` | Join values with an optional separator. |
| `md.heading` | Create a Markdown heading. |
| `md.bold` | Create bold Markdown text. |
| `md.link` | Create a Markdown link. |
| `md.codeblock` | Create a fenced code block. |
| `data.get` | Read a value from processing context variables. |

## CLI

List functions:

```text
mkt function list
```

Validate calls without rendering:

```text
mkt function check examples/functions/basic-functions.md
```

Render deterministic calls:

```text
mkt function render examples/functions/basic-functions.md
```

Each command accepts `--format json`, `--format yaml`, or `--format text`
(the default). With JSON or YAML, `render` output includes calls, diagnostics,
provenance, and trace data, and `check` reports the parsed calls and any
diagnostics. Both `check` and `render` exit with status 1 when an error
diagnostic is present.

## Registry And Extension Fit

The function layer has its own `DocumentFunctionRegistry`. Functions are
described by `DocumentFunctionDescriptor`:

- stable id and namespace
- parameters
- output type
- execution kind
- capability declarations
- safety metadata
- examples

The built-in extension catalog exposes this layer as `document.function` with
kind `document-function`. This keeps it discoverable without replacing
processors, workflows, references, contracts, templates, or query engines.

## Policy And Capability Gates

The first evaluator blocks non-deterministic functions and supports local
capability blocking through `ProcessingContext.policy`, for example:

```python
ProcessingContext(policy={"blocked_capabilities": ["document_function"]})
```

Future functions that read files, access network resources, invoke external
processes, render exports, or call assisted generation must declare those
capabilities before execution. External policy services may provide decisions
through adapters later, but deterministic function execution has no external
service dependency.
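
A minimal sketch of a local capability gate, assuming each function declares
its capabilities as plain strings and the policy is a plain `dict` shaped like
the example above. The helper names `blocked_capabilities` and `gate` are
hypothetical; the real layer works with descriptor and context objects.

```python
from __future__ import annotations

from typing import Any


def blocked_capabilities(declared: list[str], policy: dict[str, Any]) -> list[str]:
    """Return the declared capabilities that the local policy blocks."""
    blocked = set(policy.get("blocked_capabilities", []))
    return [name for name in declared if name in blocked]


def gate(function_id: str, declared: list[str], policy: dict[str, Any]) -> dict[str, Any] | None:
    """Return an error diagnostic when the call must not run, else None."""
    blocked = blocked_capabilities(declared, policy)
    if not blocked:
        return None
    return {
        "severity": "error",
        "code": "function.capability_blocked",
        "message": f"Function `{function_id}` requires blocked capabilities {blocked}.",
    }


policy = {"blocked_capabilities": ["document_function"]}
print(gate("text.upper", ["document_function"], policy))
print(gate("text.upper", ["document_function"], {}))
```

Checking the gate before any arguments are resolved keeps a blocked function
from touching context data at all.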

## Design Rules

- Stay close to Markdown and preserve CommonMark documents unless function
  syntax is explicit.
- Keep deterministic execution useful without backends or providers.
- Surface diagnostics instead of silently deleting failed calls.
- Preserve source line information where available.
- Treat functions as an authoring surface over existing capabilities, not as a
  second workflow engine.
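
The diagnostics and source-line rules can be illustrated with a small,
self-contained sketch: a failed inline call stays in the rendered text and an
error diagnostic records its line. The function table `KNOWN` and the `render`
helper are simplified stand-ins, and argument handling is reduced to a single
quoted value.

```python
from __future__ import annotations

import re

INLINE_CALL_RE = re.compile(r"\{\{mkt:(?P<body>.+?)\}\}", re.DOTALL)

# Toy function table for the sketch; only one function is known.
KNOWN = {"text.upper": str.upper}


def render(text: str) -> tuple[str, list[dict]]:
    """Expand known inline calls; keep failed calls and report them."""
    diagnostics: list[dict] = []

    def replace(match: re.Match) -> str:
        function_id, _, argument = match.group("body").strip().partition(" ")
        function = KNOWN.get(function_id)
        if function is None:
            diagnostics.append(
                {
                    "severity": "error",
                    "code": "function.unknown",
                    # 1-based line of the call in the original text.
                    "line": text.count("\n", 0, match.start()) + 1,
                    "message": f"Unknown document function `{function_id}`",
                }
            )
            return match.group(0)  # leave the failed call visible
        return function(argument.strip().strip('"'))

    return INLINE_CALL_RE.sub(replace, text), diagnostics


source = 'Status: {{mkt:text.upper "draft"}}\nOwner: {{mkt:text.shout "me"}}'
content, problems = render(source)
print(content)
print(problems)
```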
@@ -39,7 +39,7 @@ and descriptions mirror the operational view.
 | `MKTT-WP-0011` | complete | done | `MKTT-WP-0003`; task-level triggers: `MKTT-WP-0010-T001`, `MKTT-WP-0010-T005` | Markdown dataflow workflow layer is complete: workflow standard, source collectors, binding model, deterministic steps, assisted boundary, safe outputs, CLI, docs, and examples. |
 | `MKTT-WP-0009` | complete | done | `MKTT-WP-0006` | Access-controlled knowledge gateway is complete: local labels, trust zones, path rules, policy-aware cache query/search, decisions, diagnostics, and external adapter boundaries. |
 | `MKTT-WP-0014` | complete | done | `MKTT-WP-0009` | Markitect-side enterprise IAM access-control integration is complete: NetKingdom/key-cape-compatible identity claims, flex-auth resource/policy contract, directory group resolution fixtures, decision-log sink, workflow declarations, CLI commands, and external PDP request examples. |
-| `MKTT-WP-0012` | P3 | todo | `MKTT-WP-0004`, `MKTT-WP-0010`, `MKTT-WP-0011` | Future Quarkdown-inspired document function layer: reusable Markdown-native function calls over processors, references, contracts, workflows, and later assisted steps. |
+| `MKTT-WP-0012` | complete | done | `MKTT-WP-0004`, `MKTT-WP-0010`, `MKTT-WP-0011` | Document function layer is complete: deterministic Markdown-native function descriptors, registry, inline/fenced syntax, pipelines, context bindings, CLI, docs, examples, diagnostics, provenance, and extension descriptor. |
 | `MKTT-WP-0008` | P3 | todo | `MKTT-WP-0006`, `MKTT-WP-0007`, `MKTT-WP-0009` | Agent working-memory cache after backend and policy floor are available. |

 ## Dependency Notes
@@ -69,11 +69,10 @@ runtime/workflow expansion because it reduces central wiring and gives future
 features a canonical processing context/result/diagnostic/provenance model. It
 is not a business dataflow layer; that remains `MKTT-WP-0011`.

-`MKTT-WP-0012` captures the Quarkdown-inspired document function layer. It
-should follow `MKTT-WP-0011` because the workflow layer will reveal which
-operations deserve author-facing function syntax. It should remain optional and
-capability-gated, especially before assisted, external, file, or network
-functions are allowed.
+`MKTT-WP-0012` completed the Quarkdown-inspired document function layer as a
+deterministic authoring surface over existing Markitect capabilities. Assisted,
+external, file, network, render/export, and provider-backed functions remain
+future optional extensions behind local capability and policy gates.

 `MKTT-WP-0014` completed Markitect-side enterprise IAM integration for the
 access-control gateway. Central authorization administration remains optional
15
examples/functions/basic-functions.md
Normal file
@@ -0,0 +1,15 @@
# Basic Document Functions

Status: {{mkt:text.upper "draft"}}

{{mkt:md.bold "Important"}} sections can use inline functions.

Pipeline result: {{mkt:text.upper "draft" | text.replace DRAFT Final}}

```mkt-function md.heading level=2
Generated Section
```

```mkt-function md.codeblock lang=python
print("hello from a deterministic function")
```
@@ -20,6 +20,19 @@ from markitect_tool.contract import (
    validate_contract,
    validate_contract_file,
)
from markitect_tool.document_function import (
    DocumentFunctionCall,
    DocumentFunctionDescriptor,
    DocumentFunctionError,
    DocumentFunctionEvaluationResult,
    DocumentFunctionParameter,
    DocumentFunctionRegistry,
    DocumentFunctionRun,
    default_document_function_registry,
    parse_document_function_calls,
    render_document_functions,
    validate_document_functions,
)
from markitect_tool.cache import (
    CacheEntry,
    CacheManifest,
@@ -220,6 +233,17 @@ __all__ = [
    "load_contract_file",
    "validate_contract",
    "validate_contract_file",
    "DocumentFunctionCall",
    "DocumentFunctionDescriptor",
    "DocumentFunctionError",
    "DocumentFunctionEvaluationResult",
    "DocumentFunctionParameter",
    "DocumentFunctionRegistry",
    "DocumentFunctionRun",
    "default_document_function_registry",
    "parse_document_function_calls",
    "render_document_functions",
    "validate_document_functions",
    "CacheEntry",
    "CacheManifest",
    "CacheStatus",
@@ -37,6 +37,13 @@ from markitect_tool.contract import (
    load_contract_file,
    validate_contract,
)
from markitect_tool.document_function import (
    DocumentFunctionError,
    default_document_function_registry,
    render_document_functions,
    validate_document_functions,
)
from markitect_tool.extension import ProcessingContext
from markitect_tool.explode import (
    ExplodeError,
    explode_markdown_file,
@@ -858,6 +865,77 @@ def policy_resource_manifest(manifest_file: Path, output_format: str) -> None:
    _emit_resource_manifest_result({"manifest": manifest.to_dict()}, output_format)


@main.group("function")
def function_group() -> None:
    """Inspect and execute deterministic document functions."""


@function_group.command("list")
@click.option("--namespace", help="Only list functions in one namespace.")
@click.option(
    "--format",
    "output_format",
    type=click.Choice(["json", "yaml", "text"], case_sensitive=False),
    default="text",
    show_default=True,
)
def function_list(namespace: str | None, output_format: str) -> None:
    """List registered document functions."""

    registry = default_document_function_registry()
    functions = [descriptor.to_dict() for descriptor in registry.list(namespace=namespace)]
    _emit_function_catalog({"count": len(functions), "functions": functions}, output_format)


@function_group.command("render")
@click.argument("file", type=click.Path(exists=True, dir_okay=False, path_type=Path))
@click.option(
    "--format",
    "output_format",
    type=click.Choice(["json", "yaml", "text"], case_sensitive=False),
    default="text",
    show_default=True,
)
def function_render(file: Path, output_format: str) -> None:
    """Render deterministic document function calls in a Markdown file."""

    try:
        text = file.read_text(encoding="utf-8")
        result = render_document_functions(text, context=ProcessingContext(source_path=file))
    except DocumentFunctionError as exc:
        raise click.ClickException(str(exc)) from exc
    _emit_function_result(result.to_dict(), output_format)
    raise click.exceptions.Exit(0 if result.valid else 1)


@function_group.command("check")
@click.argument("file", type=click.Path(exists=True, dir_okay=False, path_type=Path))
@click.option("--allow", "allowed", multiple=True, help="Only allow this function id. May be repeated.")
@click.option("--forbid", "forbidden", multiple=True, help="Forbid this function id. May be repeated.")
@click.option(
    "--format",
    "output_format",
    type=click.Choice(["json", "yaml", "text"], case_sensitive=False),
    default="text",
    show_default=True,
)
def function_check(
    file: Path,
    allowed: tuple[str, ...],
    forbidden: tuple[str, ...],
    output_format: str,
) -> None:
    """Validate document function calls without rendering."""

    try:
        text = file.read_text(encoding="utf-8")
        result = validate_document_functions(text, allowed=list(allowed), forbidden=list(forbidden))
    except DocumentFunctionError as exc:
        raise click.ClickException(str(exc)) from exc
    _emit_function_check_result(result.to_dict(), output_format)
    raise click.exceptions.Exit(0 if result.valid else 1)


@main.group("class")
def class_group() -> None:
    """Resolve deterministic content classes."""
@@ -1831,6 +1909,39 @@ def _emit_resource_manifest_result(data: dict, output_format: str) -> None:
click.echo(f"actions: {actions}")


def _emit_function_catalog(data: dict, output_format: str) -> None:
    if output_format == "json":
        click.echo(json.dumps(data, indent=2, ensure_ascii=False))
    elif output_format == "yaml":
        click.echo(yaml.safe_dump(data, sort_keys=False))
    else:
        for function in data.get("functions", []):
            click.echo(f"{function['id']}: {function.get('summary', '')}")


def _emit_function_result(data: dict, output_format: str) -> None:
    if output_format == "json":
        click.echo(json.dumps(data, indent=2, ensure_ascii=False))
    elif output_format == "yaml":
        click.echo(yaml.safe_dump(data, sort_keys=False))
    else:
        click.echo(data.get("content", ""))
        for diagnostic in data.get("diagnostics", []):
            click.echo(f"[{diagnostic['severity']}] {diagnostic['code']}: {diagnostic['message']}")


def _emit_function_check_result(data: dict, output_format: str) -> None:
    if output_format == "json":
        click.echo(json.dumps(data, indent=2, ensure_ascii=False))
    elif output_format == "yaml":
        click.echo(yaml.safe_dump(data, sort_keys=False))
    else:
        click.echo("valid" if data.get("valid") else "invalid")
        click.echo(f"functions: {len(data.get('calls', []))}")
        for diagnostic in data.get("diagnostics", []):
            click.echo(f"- [{diagnostic['severity']}] {diagnostic['code']}: {diagnostic['message']}")


def _emit_metrics(data: dict, output_format: str) -> None:
    if output_format == "json":
        click.echo(json.dumps(data, indent=2, ensure_ascii=False))
791
src/markitect_tool/document_function.py
Normal file
@@ -0,0 +1,791 @@
"""Markdown-native deterministic document functions."""

from __future__ import annotations

import json
import re
import shlex
from dataclasses import asdict, dataclass, field
from typing import Any, Callable

from markitect_tool.diagnostics import Diagnostic, SourceLocation, has_error
from markitect_tool.extension import (
    ProcessingCapability,
    ProcessingContext,
    ProcessingProvenance,
    ProcessingResult,
    ProcessingTrace,
)


INLINE_CALL_RE = re.compile(r"\{\{mkt:(?P<body>.+?)\}\}", re.DOTALL)
FENCE_CALL_RE = re.compile(
    r"```(?P<info>[^\n`]*)\n(?P<body>.*?)\n```",
    re.DOTALL,
)

FunctionImplementation = Callable[..., Any]


class DocumentFunctionError(ValueError):
    """Raised when document function parsing or evaluation fails."""


@dataclass(frozen=True)
class DocumentFunctionParameter:
    """One declared document function parameter."""

    name: str
    kind: str = "string"
    required: bool = True
    default: Any = None
    variadic: bool = False
    description: str | None = None

    def to_dict(self) -> dict[str, Any]:
        return _drop_empty(asdict(self))


@dataclass(frozen=True)
class DocumentFunctionDescriptor:
    """Inspectable descriptor for a document function."""

    id: str
    summary: str
    parameters: list[DocumentFunctionParameter] = field(default_factory=list)
    output_type: str = "markdown"
    execution: str = "deterministic"
    capabilities: list[ProcessingCapability] = field(default_factory=list)
    safety: dict[str, Any] = field(default_factory=dict)
    examples: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)
    implementation: FunctionImplementation | None = field(default=None, compare=False, repr=False)

    @property
    def namespace(self) -> str:
        return self.id.split(".", 1)[0] if "." in self.id else "default"

    def to_dict(self) -> dict[str, Any]:
        return _drop_empty(
            {
                "id": self.id,
                "namespace": self.namespace,
                "summary": self.summary,
                "parameters": [parameter.to_dict() for parameter in self.parameters],
                "output_type": self.output_type,
                "execution": self.execution,
                "capabilities": [capability.to_dict() for capability in self.capabilities],
                "safety": self.safety,
                "examples": self.examples,
                "metadata": self.metadata,
            }
        )


@dataclass(frozen=True)
class DocumentFunctionCall:
    """Parsed document function call."""

    function_id: str
    args: list[Any] = field(default_factory=list)
    kwargs: dict[str, Any] = field(default_factory=dict)
    body: str | None = None
    raw: str = ""
    inline: bool = True
    line: int | None = None
    pipeline: list["DocumentFunctionCall"] = field(default_factory=list)

    def to_dict(self) -> dict[str, Any]:
        data = asdict(self)
        data["pipeline"] = [call.to_dict() for call in self.pipeline]
        return _drop_empty(data)


@dataclass(frozen=True)
class DocumentFunctionRun:
    """One function call result."""

    call: DocumentFunctionCall
    output: Any = None
    diagnostics: list[Diagnostic] = field(default_factory=list)
    provenance: list[ProcessingProvenance] = field(default_factory=list)
    trace: list[ProcessingTrace] = field(default_factory=list)

    @property
    def valid(self) -> bool:
        return not has_error(self.diagnostics)

    def to_dict(self) -> dict[str, Any]:
        return _drop_empty(
            {
                "call": self.call.to_dict(),
                "valid": self.valid,
                "output": self.output,
                "diagnostics": [diagnostic.to_dict() for diagnostic in self.diagnostics],
                "provenance": [event.to_dict() for event in self.provenance],
                "trace": [event.to_dict() for event in self.trace],
            }
        )


@dataclass(frozen=True)
class DocumentFunctionEvaluationResult:
    """Result of expanding document functions in a Markdown document."""

    content: str
    calls: list[DocumentFunctionRun] = field(default_factory=list)
    diagnostics: list[Diagnostic] = field(default_factory=list)
    provenance: list[ProcessingProvenance] = field(default_factory=list)
    trace: list[ProcessingTrace] = field(default_factory=list)

    @property
    def valid(self) -> bool:
        return not has_error(self.diagnostics)

    def to_processing_result(self) -> ProcessingResult:
        return ProcessingResult(
            output={"content": self.content},
            diagnostics=self.diagnostics,
            provenance=self.provenance,
            trace=self.trace,
            metadata={"calls": [run.call.to_dict() for run in self.calls]},
        )

    def to_dict(self) -> dict[str, Any]:
        return _drop_empty(
            {
                "valid": self.valid,
                "content": self.content,
                "calls": [run.to_dict() for run in self.calls],
                "diagnostics": [diagnostic.to_dict() for diagnostic in self.diagnostics],
                "provenance": [event.to_dict() for event in self.provenance],
                "trace": [event.to_dict() for event in self.trace],
            }
        )


class DocumentFunctionRegistry:
    """Registry and evaluator for document functions."""

    def __init__(
        self,
        descriptors: list[DocumentFunctionDescriptor] | None = None,
    ) -> None:
        self._descriptors: dict[str, DocumentFunctionDescriptor] = {}
        for descriptor in descriptors or []:
            self.register(descriptor)

    def register(self, descriptor: DocumentFunctionDescriptor) -> None:
        if descriptor.id in self._descriptors:
            raise DocumentFunctionError(f"Duplicate document function `{descriptor.id}`")
        if descriptor.implementation is None:
            raise DocumentFunctionError(f"Document function `{descriptor.id}` has no implementation")
        self._descriptors[descriptor.id] = descriptor

    def get(self, function_id: str) -> DocumentFunctionDescriptor:
        try:
            return self._descriptors[function_id]
        except KeyError as exc:
            raise DocumentFunctionError(f"Unknown document function `{function_id}`") from exc

    def list(self, *, namespace: str | None = None) -> list[DocumentFunctionDescriptor]:
        descriptors = [self._descriptors[key] for key in sorted(self._descriptors)]
        if namespace is not None:
            return [descriptor for descriptor in descriptors if descriptor.namespace == namespace]
        return descriptors

    def to_dict(self) -> dict[str, Any]:
        return {
            "count": len(self._descriptors),
            "functions": [descriptor.to_dict() for descriptor in self.list()],
        }

    def evaluate_call(
        self,
        call: DocumentFunctionCall,
        *,
        context: ProcessingContext | None = None,
    ) -> DocumentFunctionRun:
        context = context or ProcessingContext()
        output: Any = None
        diagnostics: list[Diagnostic] = []
        provenance: list[ProcessingProvenance] = []
        trace: list[ProcessingTrace] = []
        calls = [call, *call.pipeline]

        for index, current in enumerate(calls):
            if index > 0:
                current = DocumentFunctionCall(
                    function_id=current.function_id,
                    args=[output, *current.args],
                    kwargs=current.kwargs,
                    body=current.body,
                    raw=current.raw,
                    inline=current.inline,
                    line=current.line,
                )
            run = self._evaluate_single(current, context=context)
            diagnostics.extend(run.diagnostics)
            provenance.extend(run.provenance)
            trace.extend(run.trace)
            if not run.valid:
                output = current.raw
                break
            output = run.output

        return DocumentFunctionRun(
            call=call,
            output=output,
            diagnostics=diagnostics,
            provenance=provenance,
            trace=trace,
        )

    def _evaluate_single(
        self,
        call: DocumentFunctionCall,
        *,
        context: ProcessingContext,
    ) -> DocumentFunctionRun:
        try:
            descriptor = self.get(call.function_id)
        except DocumentFunctionError as exc:
            return _call_error(call, "function.unknown", str(exc), context)

        if descriptor.execution != "deterministic":
            return _call_error(
                call,
                "function.execution_blocked",
                f"Function `{descriptor.id}` is `{descriptor.execution}` and is not enabled.",
                context,
                details={"execution": descriptor.execution},
            )
        blocked = _blocked_capabilities(descriptor, context)
        if blocked:
            return _call_error(
                call,
                "function.capability_blocked",
                f"Function `{descriptor.id}` requires blocked capabilities {blocked}.",
                context,
                details={"capabilities": blocked},
            )

        try:
            args = [_resolve_value(arg, context) for arg in call.args]
            kwargs = {key: _resolve_value(value, context) for key, value in call.kwargs.items()}
            if call.body is not None:
                kwargs.setdefault("body", _resolve_value(call.body, context))
            _validate_arguments(descriptor, args, kwargs)
            if descriptor.id == "data.get":
                output = context.variables.get(str(args[0]), kwargs.get("default", ""))
                raise _FunctionOutputReady(output)
            assert descriptor.implementation is not None
            output = descriptor.implementation(*args, **kwargs)
        except _FunctionOutputReady as ready:
            output = ready.output
        except Exception as exc:
            return _call_error(call, "function.evaluation_failed", str(exc), context)

        provenance = [
            ProcessingProvenance(
                operation=f"document_function.{descriptor.id}",
                source_path=str(context.source_path) if context.source_path else None,
                metadata={
                    "function": descriptor.id,
                    "execution": descriptor.execution,
                    "output_type": descriptor.output_type,
                },
            )
        ]
        trace = [
            ProcessingTrace(
                event="document_function.executed",
                metadata={"function": descriptor.id, "line": call.line},
            )
        ]
        return DocumentFunctionRun(call=call, output=output, provenance=provenance, trace=trace)


def default_document_function_registry() -> DocumentFunctionRegistry:
    """Return built-in deterministic document functions."""

    return DocumentFunctionRegistry(
        [
            _descriptor(
                "text.upper",
                "Uppercase text.",
                _text_upper,
                [DocumentFunctionParameter("value")],
                examples=['{{mkt:text.upper "draft"}}'],
            ),
            _descriptor(
                "text.lower",
                "Lowercase text.",
                _text_lower,
                [DocumentFunctionParameter("value")],
                examples=['{{mkt:text.lower "DRAFT"}}'],
            ),
            _descriptor(
                "text.title",
                "Title-case text.",
                _text_title,
                [DocumentFunctionParameter("value")],
                examples=['{{mkt:text.title "release notes"}}'],
            ),
            _descriptor(
                "text.trim",
                "Trim surrounding whitespace.",
                _text_trim,
                [DocumentFunctionParameter("value")],
                examples=['{{mkt:text.trim " ok "}}'],
            ),
            _descriptor(
                "text.replace",
                "Replace text deterministically.",
                _text_replace,
                [
                    DocumentFunctionParameter("value"),
                    DocumentFunctionParameter("old"),
                    DocumentFunctionParameter("new"),
                ],
                examples=['{{mkt:text.replace "draft" draft final}}'],
            ),
            _descriptor(
                "text.join",
                "Join text values.",
                _text_join,
                [
                    DocumentFunctionParameter("items", variadic=True),
                    DocumentFunctionParameter("sep", required=False, default=""),
                ],
                examples=['{{mkt:text.join "A" "B" sep=", "}}'],
            ),
            _descriptor(
                "md.heading",
                "Create a Markdown heading.",
                _md_heading,
                [
                    DocumentFunctionParameter("text", required=False),
                    DocumentFunctionParameter("level", kind="integer", required=False, default=2),
                    DocumentFunctionParameter("body", required=False),
                ],
                examples=['{{mkt:md.heading text="Decision" level=2}}'],
            ),
            _descriptor(
                "md.bold",
                "Create Markdown bold text.",
                _md_bold,
                [DocumentFunctionParameter("text")],
                examples=['{{mkt:md.bold "Important"}}'],
            ),
            _descriptor(
                "md.link",
                "Create a Markdown link.",
                _md_link,
                [DocumentFunctionParameter("text"), DocumentFunctionParameter("url")],
                examples=['{{mkt:md.link "OpenAI" "https://openai.com"}}'],
            ),
            _descriptor(
                "md.codeblock",
                "Create a fenced Markdown code block.",
                _md_codeblock,
                [
                    DocumentFunctionParameter("body", required=False),
                    DocumentFunctionParameter("lang", required=False, default=""),
                ],
                examples=["```mkt-function md.codeblock lang=python\nprint('hi')\n```"],
            ),
            _descriptor(
                "data.get",
                "Read a value from processing context variables.",
                _data_get,
                [DocumentFunctionParameter("key"), DocumentFunctionParameter("default", required=False, default="")],
                examples=["{{mkt:data.get title}}"],
            ),
        ]
    )


def parse_document_function_calls(text: str) -> list[DocumentFunctionCall]:
    """Parse inline and fenced document function calls."""

    calls: list[DocumentFunctionCall] = []
    for match in INLINE_CALL_RE.finditer(text):
        line = _line_for_offset(text, match.start())
        calls.append(_parse_call_expression(match.group("body"), raw=match.group(0), inline=True, line=line))

    for match in FENCE_CALL_RE.finditer(text):
        info = match.group("info").strip()
        tokens = info.split(None, 1)
        if not tokens or tokens[0] not in {"mkt-function", "markitect-function", "function"}:
            continue
        expression = tokens[1] if len(tokens) > 1 else ""
        line = _line_for_offset(text, match.start())
        calls.append(
            _parse_call_expression(
                expression,
                raw=match.group(0),
                inline=False,
                line=line,
                body=match.group("body"),
            )
        )
    return calls


def render_document_functions(
    text: str,
    *,
    registry: DocumentFunctionRegistry | None = None,
    context: ProcessingContext | None = None,
) -> DocumentFunctionEvaluationResult:
    """Expand deterministic document functions in Markdown content."""

    registry = registry or default_document_function_registry()
    context = context or ProcessingContext()
    runs: list[DocumentFunctionRun] = []
    diagnostics: list[Diagnostic] = []
    provenance: list[ProcessingProvenance] = []
    trace: list[ProcessingTrace] = []

    def replace_inline(match: re.Match[str]) -> str:
        call = _parse_call_expression(
            match.group("body"),
            raw=match.group(0),
            inline=True,
            line=_line_for_offset(text, match.start()),
        )
        run = registry.evaluate_call(call, context=context)
        runs.append(run)
        diagnostics.extend(run.diagnostics)
        provenance.extend(run.provenance)
        trace.extend(run.trace)
        if not run.valid:
            return match.group(0)
        return _format_function_output(run.output, inline=True)

    content = INLINE_CALL_RE.sub(replace_inline, text)

    def replace_fence(match: re.Match[str]) -> str:
        info = match.group("info").strip()
        tokens = info.split(None, 1)
        if not tokens or tokens[0] not in {"mkt-function", "markitect-function", "function"}:
            return match.group(0)
        call = _parse_call_expression(
            tokens[1] if len(tokens) > 1 else "",
            raw=match.group(0),
            inline=False,
            line=_line_for_offset(text, match.start()),
            body=match.group("body"),
        )
        run = registry.evaluate_call(call, context=context)
        runs.append(run)
        diagnostics.extend(run.diagnostics)
        provenance.extend(run.provenance)
        trace.extend(run.trace)
        if not run.valid:
            return match.group(0)
        return _format_function_output(run.output, inline=False)

    content = FENCE_CALL_RE.sub(replace_fence, content)
    trace.append(ProcessingTrace(event="document_function.rendered", metadata={"calls": len(runs)}))
    return DocumentFunctionEvaluationResult(
        content=content,
        calls=runs,
        diagnostics=diagnostics,
        provenance=provenance,
        trace=trace,
    )


def validate_document_functions(
    text: str,
    *,
    registry: DocumentFunctionRegistry | None = None,
    allowed: list[str] | None = None,
    forbidden: list[str] | None = None,
) -> DocumentFunctionEvaluationResult:
    """Validate function calls without rendering the document."""

    registry = registry or default_document_function_registry()
    allowed_set = set(allowed or [])
    forbidden_set = set(forbidden or [])
    diagnostics: list[Diagnostic] = []
    runs: list[DocumentFunctionRun] = []
    for call in parse_document_function_calls(text):
        if allowed_set and call.function_id not in allowed_set:
            diagnostics.append(_diagnostic(call, "function.not_allowed", f"Function `{call.function_id}` is not allowed."))
        if call.function_id in forbidden_set:
            diagnostics.append(_diagnostic(call, "function.forbidden", f"Function `{call.function_id}` is forbidden."))
        try:
            descriptor = registry.get(call.function_id)
            if descriptor.execution != "deterministic":
                diagnostics.append(
                    _diagnostic(
                        call,
                        "function.unstable",
                        f"Function `{call.function_id}` is `{descriptor.execution}` and cannot run in deterministic contexts.",
                    )
                )
        except DocumentFunctionError as exc:
            diagnostics.append(_diagnostic(call, "function.unknown", str(exc)))
        runs.append(DocumentFunctionRun(call=call))
    return DocumentFunctionEvaluationResult(content=text, calls=runs, diagnostics=diagnostics)
|
||||
|
||||
|
||||
def _parse_call_expression(
    expression: str,
    *,
    raw: str,
    inline: bool,
    line: int | None,
    body: str | None = None,
) -> DocumentFunctionCall:
    # Note: this splits on every `|`, including one inside a quoted argument.
    pipeline_parts = [part.strip() for part in expression.split("|") if part.strip()]
    if not pipeline_parts:
        raise DocumentFunctionError("Document function call is empty.")
    first = _parse_single_call(pipeline_parts[0], raw=raw, inline=inline, line=line, body=body)
    pipeline = [
        _parse_single_call(part, raw=part, inline=inline, line=line)
        for part in pipeline_parts[1:]
    ]
    return DocumentFunctionCall(
        function_id=first.function_id,
        args=first.args,
        kwargs=first.kwargs,
        body=first.body,
        raw=raw,
        inline=inline,
        line=line,
        pipeline=pipeline,
    )


def _parse_single_call(
    expression: str,
    *,
    raw: str,
    inline: bool,
    line: int | None,
    body: str | None = None,
) -> DocumentFunctionCall:
    try:
        # shlex honours quotes and then strips them, so a quoted "3" or "true"
        # is still coerced by _parse_literal below.
        parts = shlex.split(expression)
    except ValueError as exc:
        raise DocumentFunctionError(f"Invalid function syntax: {exc}") from exc
    if not parts:
        raise DocumentFunctionError("Document function call is empty.")
    function_id = parts[0]
    args: list[Any] = []
    kwargs: dict[str, Any] = {}
    for token in parts[1:]:
        if "=" in token and not token.startswith("="):
            key, value = token.split("=", 1)
            kwargs[key.replace("-", "_")] = _parse_literal(value)
        else:
            args.append(_parse_literal(token))
    return DocumentFunctionCall(
        function_id=function_id,
        args=args,
        kwargs=kwargs,
        body=body,
        raw=raw,
        inline=inline,
        line=line,
    )
|
||||
|
||||
|
||||
def _descriptor(
    function_id: str,
    summary: str,
    implementation: FunctionImplementation,
    parameters: list[DocumentFunctionParameter],
    *,
    output_type: str = "markdown",
    examples: list[str] | None = None,
) -> DocumentFunctionDescriptor:
    return DocumentFunctionDescriptor(
        id=function_id,
        summary=summary,
        parameters=parameters,
        output_type=output_type,
        capabilities=[
            ProcessingCapability(id="document_function", kind="execute"),
            ProcessingCapability(id="deterministic", kind="execution"),
        ],
        safety={"network": False, "filesystem": False, "assisted_generation": False},
        examples=examples or [],
        implementation=implementation,
    )
|
||||
|
||||
|
||||
def _validate_arguments(
    descriptor: DocumentFunctionDescriptor,
    args: list[Any],
    kwargs: dict[str, Any],
) -> None:
    required = [parameter for parameter in descriptor.parameters if parameter.required and not parameter.variadic]
    positional = [parameter for parameter in descriptor.parameters if not parameter.variadic]
    variadic = next((parameter for parameter in descriptor.parameters if parameter.variadic), None)
    if len(args) > len(positional) and variadic is None:
        raise DocumentFunctionError(f"Function `{descriptor.id}` received too many positional arguments.")
    for index, parameter in enumerate(required):
        if index < len(args) or parameter.name in kwargs:
            continue
        raise DocumentFunctionError(f"Function `{descriptor.id}` requires `{parameter.name}`.")
|
||||
|
||||
|
||||
def _blocked_capabilities(
    descriptor: DocumentFunctionDescriptor,
    context: ProcessingContext,
) -> list[str]:
    blocked = []
    policy = context.policy or {}
    blocked_ids = set(policy.get("blocked_capabilities") or [])
    for capability in descriptor.capabilities:
        if capability.id in blocked_ids:
            blocked.append(capability.id)
    if descriptor.safety.get("network") and policy.get("network") is False:
        blocked.append("network")
    if descriptor.safety.get("filesystem") and policy.get("filesystem") is False:
        blocked.append("filesystem")
    if descriptor.safety.get("assisted_generation") and policy.get("assisted_generation") is False:
        blocked.append("assisted_generation")
    return sorted(set(blocked))
|
||||
|
||||
|
||||
def _resolve_value(value: Any, context: ProcessingContext) -> Any:
    if isinstance(value, str):
        if value.startswith("${") and value.endswith("}"):
            key = value[2:-1].strip()
            return context.variables.get(key, "")
    return value


def _format_function_output(value: Any, *, inline: bool) -> str:
    if isinstance(value, str):
        return value
    if isinstance(value, list):
        return ", ".join(str(item) for item in value) if inline else "\n".join(str(item) for item in value)
    if isinstance(value, dict):
        return json.dumps(value, sort_keys=True, ensure_ascii=False)
    return "" if value is None else str(value)


def _parse_literal(value: str) -> Any:
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    if lowered in {"null", "none"}:
        return None
    try:
        return int(value)
    except ValueError:
        pass
    return value
|
||||
|
||||
|
||||
def _call_error(
    call: DocumentFunctionCall,
    code: str,
    message: str,
    context: ProcessingContext,
    details: dict[str, Any] | None = None,
) -> DocumentFunctionRun:
    return DocumentFunctionRun(
        call=call,
        diagnostics=[
            Diagnostic(
                severity="error",
                code=code,
                message=message,
                source=SourceLocation(
                    path=str(context.source_path) if context.source_path else None,
                    line=call.line,
                )
                if context.source_path or call.line
                else None,
                details=details or {"function": call.function_id},
            )
        ],
    )


def _diagnostic(
    call: DocumentFunctionCall,
    code: str,
    message: str,
) -> Diagnostic:
    return Diagnostic(
        severity="error",
        code=code,
        message=message,
        source=SourceLocation(line=call.line) if call.line else None,
        details={"function": call.function_id},
    )


def _line_for_offset(text: str, offset: int) -> int:
    return text.count("\n", 0, offset) + 1
|
||||
|
||||
|
||||
def _text_upper(value: Any) -> str:
    return str(value).upper()


def _text_lower(value: Any) -> str:
    return str(value).lower()


def _text_title(value: Any) -> str:
    return str(value).title()


def _text_trim(value: Any) -> str:
    return str(value).strip()


def _text_replace(value: Any, old: Any, new: Any) -> str:
    return str(value).replace(str(old), str(new))


def _text_join(*items: Any, sep: str = "") -> str:
    return str(sep).join(str(item) for item in items)


def _md_heading(text: Any = None, *, level: int = 2, body: Any = None) -> str:
    heading = str(text if text is not None else body if body is not None else "").strip()
    depth = max(1, min(6, int(level)))
    return f"{'#' * depth} {heading}"


def _md_bold(text: Any) -> str:
    return f"**{text}**"


def _md_link(text: Any, url: Any) -> str:
    return f"[{text}]({url})"


def _md_codeblock(body: Any = "", *, lang: str = "") -> str:
    info = str(lang).strip()
    return f"```{info}\n{body}\n```"


def _data_get(key: Any, default: Any = "", *, body: Any = None) -> Any:
    return body if body is not None else default if str(key).startswith("$") else key


class _FunctionOutputReady(Exception):
    def __init__(self, output: Any) -> None:
        self.output = output


def _drop_empty(data: dict[str, Any]) -> dict[str, Any]:
    return {
        key: value
        for key, value in data.items()
        if value not in (None, [], {}, "")
    }
|
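For readers following the argument parsing in this file, here is a minimal, self-contained sketch of how one call expression is tokenized with `shlex.split` and then coerced to literals. The names `tokenize_call` and `parse_literal` are illustrative stand-ins, not the module's own helpers, and the coercion rules are assumed to mirror `_parse_literal`. One consequence worth noting: `shlex` removes quotes before coercion, so a quoted `"3"` still becomes the integer 3.

```python
import shlex
from typing import Any


def parse_literal(value: str) -> Any:
    """Coerce a token to bool, None, or int; otherwise keep the string."""
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    if lowered in {"null", "none"}:
        return None
    try:
        return int(value)
    except ValueError:
        return value


def tokenize_call(expression: str) -> tuple[str, list[Any], dict[str, Any]]:
    """Split `name arg key=value` into (function id, args, kwargs)."""
    parts = shlex.split(expression)  # honours quotes, then strips them
    if not parts:
        raise ValueError("empty call expression")
    args: list[Any] = []
    kwargs: dict[str, Any] = {}
    for token in parts[1:]:
        if "=" in token and not token.startswith("="):
            key, value = token.split("=", 1)
            # Hyphenated keys are normalised to valid Python identifiers.
            kwargs[key.replace("-", "_")] = parse_literal(value)
        else:
            args.append(parse_literal(token))
    return parts[0], args, kwargs


print(tokenize_call('md.heading text="Decision" level=2'))
print(tokenize_call('text.upper "3"'))
```

If string-preserving quotes matter for a given function, the coercion step would need to know whether a token was quoted, which `shlex.split` alone does not report.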
||||
@@ -18,6 +18,7 @@ def builtin_extension_registry() -> ExtensionRegistry:
        _runtime_form_state_descriptor(),
        _runtime_assessment_descriptor(),
        _local_label_policy_descriptor(),
        _document_function_descriptor(),
    ]:
        registry.register(descriptor)
    return registry
|
||||
@@ -233,3 +234,34 @@ def _local_label_policy_descriptor() -> ExtensionDescriptor:
            ]
        },
    )


def _document_function_descriptor() -> ExtensionDescriptor:
    return ExtensionDescriptor(
        id="document.function",
        kind="document-function",
        summary="Markdown-native deterministic document function registry and evaluator.",
        capabilities=[
            ProcessingCapability(id="document_function", kind="execute"),
            ProcessingCapability(id="deterministic", kind="execution"),
            ProcessingCapability(id="diagnostics", kind="emit"),
            ProcessingCapability(id="provenance", kind="emit"),
        ],
        safety={
            "network": False,
            "filesystem": False,
            "assisted_generation": False,
            "external_process": False,
        },
        input_contract="Markdown with {{mkt:function ...}} or mkt-function fences",
        output_contract="DocumentFunctionEvaluationResult",
        diagnostics_namespace="document_function",
        provenance_prefix="document_function",
        cli={"commands": ["mkt function list", "mkt function check", "mkt function render"]},
        docs=["docs/document-functions.md"],
        examples=["examples/functions/basic-functions.md"],
        metadata={
            "execution": "deterministic-only",
            "external_policy_services_required": False,
        },
    )
|
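The safety flags in this descriptor only matter once something compares them with a policy. Below is a minimal sketch of that comparison, assuming a plain `dict` policy with a `blocked_capabilities` list and per-flag booleans. The `blocked` helper and the sample flag sets are hypothetical and only loosely follow `_blocked_capabilities` from this commit.

```python
def blocked(capability_ids: list[str], safety: dict[str, bool], policy: dict) -> list[str]:
    """Return the sorted capability or safety names that the policy blocks."""
    blocked_ids = set(policy.get("blocked_capabilities") or [])
    hits = {cap for cap in capability_ids if cap in blocked_ids}
    for flag in ("network", "filesystem", "assisted_generation"):
        # A flag is blocked only when the function needs it and the policy says False.
        if safety.get(flag) and policy.get(flag) is False:
            hits.add(flag)
    return sorted(hits)


deterministic = {"network": False, "filesystem": False, "assisted_generation": False}
print(blocked(["document_function", "deterministic"], deterministic, {"network": False}))

# A hypothetical future adapter that needs network access:
print(blocked(["document_function"], {"network": True}, {"network": False}))
```

Because every built-in declares all three flags as `False`, a restrictive policy blocks nothing today; the check becomes relevant only when optional adapters declare riskier needs.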
||||
|
||||
@@ -18,6 +18,7 @@ def test_builtin_extension_registry_lists_query_processors_and_backend():
    assert "runtime.form-state" in ids
    assert "runtime.assessment" in ids
    assert "policy.local-label" in ids
    assert "document.function" in ids


def test_builtin_processor_descriptors_capture_safety_and_provenance():
|
||||
@@ -103,3 +104,22 @@ def test_builtin_policy_descriptor_exposes_cli_and_adapter_boundary():
    assert "mkt policy resource-manifest" in descriptor.cli["commands"]
    assert "IdentityClaimsAdapter" in descriptor.metadata["external_adapters"]
    assert "RelationshipPolicyAdapter" in descriptor.metadata["external_adapters"]


def test_builtin_document_function_descriptor_exposes_deterministic_boundary():
    registry = builtin_extension_registry()

    descriptor = registry.get("document.function")

    assert descriptor.kind == "document-function"
    assert descriptor.safety["network"] is False
    assert descriptor.metadata["external_policy_services_required"] is False
    assert {capability.id for capability in descriptor.capabilities} >= {
        "document_function",
        "deterministic",
    }
    assert descriptor.cli["commands"] == [
        "mkt function list",
        "mkt function check",
        "mkt function render",
    ]
|
||||
|
||||
@@ -18,6 +18,7 @@ def test_collect_cli_command_specs_from_builtin_registry():
    assert ("processor.uppercase", "mkt process") in commands
    assert ("backend.local-sqlite", "mkt cache index") in commands
    assert ("backend.local-sqlite", "mkt search") in commands
    assert ("document.function", "mkt function render") in commands


def test_cli_command_spec_serializes_without_empty_fields():
|
||||
|
||||
136
tests/test_document_functions.py
Normal file
@@ -0,0 +1,136 @@
|
||||
import json
from pathlib import Path

from click.testing import CliRunner

from markitect_tool.cli import main
from markitect_tool.document_function import (
    DocumentFunctionDescriptor,
    DocumentFunctionParameter,
    DocumentFunctionRegistry,
    default_document_function_registry,
    parse_document_function_calls,
    render_document_functions,
    validate_document_functions,
)
from markitect_tool.extension import ProcessingContext


def test_parse_inline_and_fenced_function_calls():
    text = """# Demo

Inline {{mkt:text.upper "draft"}}.

```mkt-function md.heading level=3
Decision
```
"""

    calls = parse_document_function_calls(text)

    assert [call.function_id for call in calls] == ["text.upper", "md.heading"]
    assert calls[0].args == ["draft"]
    assert calls[1].kwargs == {"level": 3}
    assert calls[1].body.strip() == "Decision"
|
||||
|
||||
|
||||
def test_render_document_functions_expands_inline_and_fenced_calls():
    text = """# Demo

Inline {{mkt:text.upper "draft"}}.

```mkt-function md.heading level=3
Decision
```
"""

    result = render_document_functions(text)

    assert result.valid
    assert "Inline DRAFT." in result.content
    assert "### Decision" in result.content
    assert len(result.calls) == 2
    assert result.provenance[0].operation == "document_function.text.upper"


def test_pipeline_passes_previous_output_to_next_function():
    result = render_document_functions('{{mkt:text.upper "draft" | text.replace DRAFT Final}}')

    assert result.valid
    assert result.content == "Final"


def test_context_variables_can_be_used_in_function_arguments():
    context = ProcessingContext(variables={"title": "Architecture Decision"})

    result = render_document_functions("{{mkt:md.heading ${title} level=2}}", context=context)

    assert result.content == "## Architecture Decision"
|
||||
|
||||
|
||||
def test_validate_document_functions_reports_forbidden_calls():
    result = validate_document_functions("{{mkt:text.upper draft}}", forbidden=["text.upper"])

    assert not result.valid
    assert result.diagnostics[0].code == "function.forbidden"


def test_registry_can_expose_custom_function_without_core_rewrite():
    registry = DocumentFunctionRegistry()
    registry.register(
        DocumentFunctionDescriptor(
            id="demo.wrap",
            summary="Wrap text.",
            parameters=[DocumentFunctionParameter("value")],
            implementation=lambda value: f"[{value}]",
        )
    )

    result = render_document_functions("{{mkt:demo.wrap ok}}", registry=registry)

    assert result.valid
    assert result.content == "[ok]"


def test_unknown_function_is_left_in_place_with_diagnostic():
    result = render_document_functions("{{mkt:nope.missing value}}")

    assert not result.valid
    assert result.content == "{{mkt:nope.missing value}}"
    assert result.diagnostics[0].code == "function.unknown"
|
||||
|
||||
|
||||
def test_mkt_function_list_outputs_builtin_catalog():
    result = CliRunner().invoke(main, ["function", "list", "--format", "json"])
    data = json.loads(result.output)

    assert result.exit_code == 0
    ids = {function["id"] for function in data["functions"]}
    assert {"text.upper", "md.heading", "md.codeblock"} <= ids


def test_mkt_function_render_outputs_expanded_markdown(tmp_path: Path):
    file = tmp_path / "functions.md"
    file.write_text("# Demo\n\n{{mkt:md.bold Important}}\n", encoding="utf-8")

    result = CliRunner().invoke(main, ["function", "render", str(file)])

    assert result.exit_code == 0
    assert "**Important**" in result.output


def test_mkt_function_check_can_restrict_allowed_functions(tmp_path: Path):
    file = tmp_path / "functions.md"
    file.write_text("{{mkt:text.upper draft}}\n", encoding="utf-8")

    result = CliRunner().invoke(main, ["function", "check", str(file), "--allow", "md.heading"])

    assert result.exit_code == 1
    assert "function.not_allowed" in result.output


def test_default_registry_serializes_without_implementations():
    data = default_document_function_registry().to_dict()

    assert data["count"] >= 1
    assert "implementation" not in data["functions"][0]
|
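The allow and forbid checks exercised by these tests can be sketched without the real registry. This is an illustrative stand-in: `check_calls` and the plain-string function ids are assumptions, and only the diagnostic codes `function.not_allowed` and `function.forbidden` come from the commit.

```python
from typing import Optional


def check_calls(
    function_ids: list[str],
    allowed: Optional[list[str]] = None,
    forbidden: Optional[list[str]] = None,
) -> list[tuple[str, str]]:
    """Return (diagnostic code, function id) pairs for calls that violate the lists."""
    allowed_set = set(allowed or [])
    forbidden_set = set(forbidden or [])
    findings: list[tuple[str, str]] = []
    for function_id in function_ids:
        # An empty allow-list means "no restriction", not "allow nothing".
        if allowed_set and function_id not in allowed_set:
            findings.append(("function.not_allowed", function_id))
        if function_id in forbidden_set:
            findings.append(("function.forbidden", function_id))
    return findings


print(check_calls(["text.upper"], allowed=["md.heading"]))
print(check_calls(["text.upper"], forbidden=["text.upper"]))
print(check_calls(["text.upper"]))
```

A call can trip both checks at once, in which case two findings are reported for it, matching the two independent `if` statements in `validate_document_functions`.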
||||
@@ -3,10 +3,10 @@ id: MKTT-WP-0012
|
||||
type: workplan
title: "Document Function Layer"
domain: markitect
-status: todo
+status: done
owner: markitect-tool
topic_slug: markitect
-planning_priority: P3
+planning_priority: complete
planning_order: 85
depends_on_workplans:
  - MKTT-WP-0004
|
||||
@@ -34,6 +34,29 @@ This layer should let authors and agents express reusable document operations
|
||||
as named functions over Markdown content, structured data, references,
|
||||
processors, contracts, workflows, and eventually assisted generation.
|
||||
|
||||
## Implementation Summary

Implemented the first deterministic document function layer:

- `DocumentFunctionDescriptor`, `DocumentFunctionParameter`,
  `DocumentFunctionCall`, `DocumentFunctionRegistry`, run/evaluation result
  envelopes, diagnostics, provenance, and trace output.
- Conservative inline syntax: `{{mkt:function.name ...}}`.
- Conservative fenced syntax: `mkt-function function.name ...`.
- Pipeline chaining with `|`, where the previous result becomes the next
  function's first argument.
- `ProcessingContext.variables` bindings through `${name}` values.
- Built-in deterministic functions for text operations, Markdown headings,
  bold text, links, code blocks, and context value lookup.
- `mkt function list`, `mkt function check`, and `mkt function render`.
- Built-in extension descriptor `document.function`.
- Documentation and examples in `docs/document-functions.md` and
  `examples/functions/basic-functions.md`.

Assisted, filesystem, network, external-process, render/export, and live policy
service functions remain future optional extensions gated by local capability
and policy metadata.
|
||||
|
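To make the pipeline and `${name}` bullets concrete, here is a small self-contained sketch. It is not the Markitect evaluator: `FUNCTIONS`, `resolve`, and `run_pipeline` are hypothetical names, only two toy functions are registered, and splitting on `|` ignores quoting purely for brevity.

```python
import shlex
from typing import Any, Callable

# Toy registry; the real one carries descriptors, diagnostics, and provenance.
FUNCTIONS: dict[str, Callable[..., str]] = {
    "text.upper": lambda value: str(value).upper(),
    "text.replace": lambda value, old, new: str(value).replace(str(old), str(new)),
}


def resolve(token: str, variables: dict[str, Any]) -> Any:
    """Replace a whole-token `${name}` with its variable, or "" if missing."""
    if token.startswith("${") and token.endswith("}"):
        return variables.get(token[2:-1].strip(), "")
    return token


def run_pipeline(expression: str, variables: dict[str, Any]) -> str:
    """Evaluate `f a | g b c`, feeding each result in as the next first argument."""
    result: Any = None
    for index, stage in enumerate(expression.split("|")):
        name, *tokens = shlex.split(stage)
        args = [resolve(token, variables) for token in tokens]
        if index > 0:
            args.insert(0, result)  # previous output becomes the first argument
        result = FUNCTIONS[name](*args)
    return str(result)


print(run_pipeline('text.upper "draft" | text.replace DRAFT Final', {}))
print(run_pipeline("text.upper ${title}", {"title": "decision"}))
```

Resolving variables per token, rather than interpolating inside larger strings, keeps the binding rule simple: a value is either exactly `${name}` or it is a literal.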
||||
## Background
|
||||
|
||||
Quarkdown demonstrates that document authoring can benefit from a compact
|
||||
@@ -102,7 +125,7 @@ a second workflow engine or a dependency on flex-auth.
|
||||
|
||||
```task
id: MKTT-WP-0012-T001
-status: todo
+status: done
priority: high
state_hub_task_id: "13ddfdbb-8fc1-4570-915d-038d40d489e1"
```
|
||||
@@ -125,7 +148,7 @@ Output: design note, schema, and small examples.
|
||||
|
||||
```task
id: MKTT-WP-0012-T002
-status: todo
+status: done
priority: high
state_hub_task_id: "58166792-457a-4844-96a7-27baf50c1d7e"
```
|
||||
@@ -150,7 +173,7 @@ Output: syntax proposal with accepted/rejected examples and parser impact.
|
||||
|
||||
```task
id: MKTT-WP-0012-T003
-status: todo
+status: done
priority: high
state_hub_task_id: "06196bde-cc10-464e-9d1a-6b8acc616c06"
```
|
||||
@@ -172,7 +195,7 @@ Output: registry API, adapter protocol, and tests with fake functions.
|
||||
|
||||
```task
id: MKTT-WP-0012-T004
-status: todo
+status: done
priority: high
state_hub_task_id: "986121f0-f824-46eb-af59-65ebf2389f34"
```
|
||||
@@ -193,7 +216,7 @@ Output: minimal evaluator and CLI/library tests.
|
||||
|
||||
```task
id: MKTT-WP-0012-T005
-status: todo
+status: done
priority: medium
state_hub_task_id: "94bb131c-cb4e-4391-8453-bb9de4f3834c"
```
|
||||
@@ -210,7 +233,7 @@ Output: chaining rules, data binding rules, and diagnostic examples.
|
||||
|
||||
```task
id: MKTT-WP-0012-T006
-status: todo
+status: done
priority: medium
state_hub_task_id: "899d361f-8eaa-4098-97d9-0fd33afc3304"
```
|
||||
@@ -233,7 +256,7 @@ Output: contract integration and actionable diagnostics.
|
||||
|
||||
```task
id: MKTT-WP-0012-T007
-status: todo
+status: done
priority: medium
state_hub_task_id: "2a51b42c-b46b-42cd-ba33-ab504100e653"
```
|
||||
@@ -252,7 +275,7 @@ Output: permission model, blocked-operation diagnostics, and policy examples.
|
||||
|
||||
```task
id: MKTT-WP-0012-T008
-status: todo
+status: done
priority: medium
state_hub_task_id: "30358902-5564-48a2-b1e3-e400bfbe7d1a"
```
|
||||
|
||||