diff --git a/docs/document-functions.md b/docs/document-functions.md index ffd519a..3cadad2 100644 --- a/docs/document-functions.md +++ b/docs/document-functions.md @@ -50,6 +50,12 @@ function: {{mkt:text.upper "draft" | text.replace DRAFT Final}} ``` +Quoted pipe characters remain literal: + +```markdown +{{mkt:text.replace "a|b" "|" "/"}} +``` + Values of the form `${name}` are resolved from `ProcessingContext.variables`. This keeps data binding aligned with workflow expression conventions without creating a second workflow engine. @@ -126,6 +132,24 @@ capabilities before execution. External policy services may provide decisions through adapters later, but deterministic function execution has no external service dependency. +## Natural Extensions + +The deterministic layer deliberately stops before becoming a full publishing +language. Future extension work is captured in +`MKTT-WP-0015: Render And Document Function Extensions`. + +That workplan should consider: + +- typed document values and value-to-Markdown mapping +- richer multiline and nested function syntax +- document-local reusable functions +- render/export adapters, including optional Quarkdown source export +- render-aware numbering, references, tables, figures, equations, and code + blocks +- static asset and media manifests with checksums +- local permission gates for filesystem, network, external process, assisted, + and render/export functions + ## Design Rules - Stay close to Markdown and preserve CommonMark documents unless function diff --git a/docs/workplan-planning-map.md b/docs/workplan-planning-map.md index e7d8e08..efed49a 100644 --- a/docs/workplan-planning-map.md +++ b/docs/workplan-planning-map.md @@ -41,6 +41,7 @@ and descriptions mirror the operational view. 
| `MKTT-WP-0014` | complete | done | `MKTT-WP-0009` | Markitect-side enterprise IAM access-control integration is complete: NetKingdom/key-cape-compatible identity claims, flex-auth resource/policy contract, directory group resolution fixtures, decision-log sink, workflow declarations, CLI commands, and external PDP request examples. | | `MKTT-WP-0012` | complete | done | `MKTT-WP-0004`, `MKTT-WP-0010`, `MKTT-WP-0011` | Document function layer is complete: deterministic Markdown-native function descriptors, registry, inline/fenced syntax, pipelines, context bindings, CLI, docs, examples, diagnostics, provenance, and extension descriptor. | | `MKTT-WP-0008` | P3 | todo | `MKTT-WP-0006`, `MKTT-WP-0007`, `MKTT-WP-0009` | Agent working-memory cache after backend and policy floor are available. | +| `MKTT-WP-0015` | P2 | todo | `MKTT-WP-0010`, `MKTT-WP-0011`, `MKTT-WP-0012` | Future render and document-function extensions: typed values, richer syntax, document-local reusable functions, Quarkdown/export adapters, render-aware references, assets, and permission sandboxing. Defer unless publishing/export pressure becomes current. | ## Dependency Notes @@ -74,6 +75,12 @@ deterministic authoring surface over existing Markitect capabilities. Assisted, external, file, network, render/export, and provider-backed functions remain future optional extensions behind local capability and policy gates. +`MKTT-WP-0015` captures those optional future extensions explicitly. It should +not disturb the deterministic core. Its Quarkdown lesson is the typed value +and render pipeline shape: functions can return document values that are mapped +back to renderable content, while render/export, media, permissions, and +numbering stay explicit extension concerns. + `MKTT-WP-0014` completed Markitect-side enterprise IAM integration for the access-control gateway. 
Central authorization administration remains optional external-service scope; Markitect now provides resource registration, policy @@ -84,8 +91,8 @@ protocols. A live flex-auth service can improve enterprise deployment, central policy administration, and durable audit, but it is not a prerequisite for the document function layer or local agent context packages. -`MKTT-WP-0012` and `MKTT-WP-0008` are the remaining Markitect workplans. Their -policy posture should be: +Remaining Markitect workplans, including `MKTT-WP-0008` and the future +`MKTT-WP-0015` extension track, should keep this policy posture: - use `AccessPolicyGateway`, `PolicySubject`, `PolicyObject`, and `PolicyDecision` as local contracts @@ -126,3 +133,6 @@ dependencies: - `MKTT-WP-0008 -> MKTT-WP-0006` - `MKTT-WP-0008 -> MKTT-WP-0007` - `MKTT-WP-0008 -> MKTT-WP-0009` +- `MKTT-WP-0015 -> MKTT-WP-0010` +- `MKTT-WP-0015 -> MKTT-WP-0011` +- `MKTT-WP-0015 -> MKTT-WP-0012` diff --git a/src/markitect_tool/document_function.py b/src/markitect_tool/document_function.py index ded3cf3..c07670c 100644 --- a/src/markitect_tool/document_function.py +++ b/src/markitect_tool/document_function.py @@ -278,11 +278,9 @@ class DocumentFunctionRegistry: _validate_arguments(descriptor, args, kwargs) if descriptor.id == "data.get": output = context.variables.get(str(args[0]), kwargs.get("default", "")) - raise _FunctionOutputReady(output) - assert descriptor.implementation is not None - output = descriptor.implementation(*args, **kwargs) - except _FunctionOutputReady as ready: - output = ready.output + else: + assert descriptor.implementation is not None + output = descriptor.implementation(*args, **kwargs) except Exception as exc: return _call_error(call, "function.evaluation_failed", str(exc), context) @@ -513,22 +511,34 @@ def validate_document_functions( diagnostics: list[Diagnostic] = [] runs: list[DocumentFunctionRun] = [] for call in parse_document_function_calls(text): - if allowed_set and call.function_id not in 
allowed_set: - diagnostics.append(_diagnostic(call, "function.not_allowed", f"Function `{call.function_id}` is not allowed.")) - if call.function_id in forbidden_set: - diagnostics.append(_diagnostic(call, "function.forbidden", f"Function `{call.function_id}` is forbidden.")) - try: - descriptor = registry.get(call.function_id) - if descriptor.execution != "deterministic": + for index, current in enumerate([call, *call.pipeline]): + if allowed_set and current.function_id not in allowed_set: diagnostics.append( _diagnostic( - call, - "function.unstable", - f"Function `{call.function_id}` is `{descriptor.execution}` and cannot run in deterministic contexts.", + current, + "function.not_allowed", + f"Function `{current.function_id}` is not allowed.", ) ) - except DocumentFunctionError as exc: - diagnostics.append(_diagnostic(call, "function.unknown", str(exc))) + if current.function_id in forbidden_set: + diagnostics.append( + _diagnostic(current, "function.forbidden", f"Function `{current.function_id}` is forbidden.") + ) + try: + descriptor = registry.get(current.function_id) + if descriptor.execution != "deterministic": + diagnostics.append( + _diagnostic( + current, + "function.unstable", + f"Function `{current.function_id}` is `{descriptor.execution}` and cannot run in deterministic contexts.", + ) + ) + args = current.args if index == 0 else ["", *current.args] + _validate_arguments(descriptor, args, current.kwargs) + except DocumentFunctionError as exc: + code = "function.unknown" if str(exc).startswith("Unknown document function") else "function.arguments" + diagnostics.append(_diagnostic(current, code, str(exc))) runs.append(DocumentFunctionRun(call=call)) return DocumentFunctionEvaluationResult(content=text, calls=runs, diagnostics=diagnostics) @@ -541,7 +551,7 @@ def _parse_call_expression( line: int | None, body: str | None = None, ) -> DocumentFunctionCall: - pipeline_parts = [part.strip() for part in expression.split("|") if part.strip()] + 
pipeline_parts = _split_pipeline_expression(expression) if not pipeline_parts: raise DocumentFunctionError("Document function call is empty.") first = _parse_single_call(pipeline_parts[0], raw=raw, inline=inline, line=line, body=body) @@ -627,6 +637,18 @@ def _validate_arguments( required = [parameter for parameter in descriptor.parameters if parameter.required and not parameter.variadic] positional = [parameter for parameter in descriptor.parameters if not parameter.variadic] variadic = next((parameter for parameter in descriptor.parameters if parameter.variadic), None) + parameter_names = {parameter.name for parameter in descriptor.parameters} + unknown = sorted(set(kwargs) - parameter_names) + if unknown: + raise DocumentFunctionError( + f"Function `{descriptor.id}` received unknown named argument `{unknown[0]}`." + ) + if variadic is None: + for index, parameter in enumerate(positional[: len(args)]): + if parameter.name in kwargs: + raise DocumentFunctionError( + f"Function `{descriptor.id}` received `{parameter.name}` both positionally and by name." 
+ ) if len(args) > len(positional) and variadic is None: raise DocumentFunctionError(f"Function `{descriptor.id}` received too many positional arguments.") for index, parameter in enumerate(required): @@ -635,6 +657,42 @@ def _validate_arguments( raise DocumentFunctionError(f"Function `{descriptor.id}` requires `{parameter.name}`.") +def _split_pipeline_expression(expression: str) -> list[str]: + parts: list[str] = [] + current: list[str] = [] + quote: str | None = None + escaped = False + for char in expression: + if escaped: + current.append(char) + escaped = False + continue + if char == "\\": + current.append(char) + escaped = True + continue + if char in {"'", '"'}: + if quote == char: + quote = None + elif quote is None: + quote = char + current.append(char) + continue + if char == "|" and quote is None: + part = "".join(current).strip() + if part: + parts.append(part) + current = [] + continue + current.append(char) + if quote is not None: + raise DocumentFunctionError("Invalid function pipeline: unterminated quote.") + part = "".join(current).strip() + if part: + parts.append(part) + return parts + + def _blocked_capabilities( descriptor: DocumentFunctionDescriptor, context: ProcessingContext, @@ -778,11 +836,6 @@ def _data_get(key: Any, default: Any = "", *, body: Any = None) -> Any: return body if body is not None else default if str(key).startswith("$") else key -class _FunctionOutputReady(Exception): - def __init__(self, output: Any) -> None: - self.output = output - - def _drop_empty(data: dict[str, Any]) -> dict[str, Any]: return { key: value diff --git a/tests/test_document_functions.py b/tests/test_document_functions.py index e2f6bee..95a7443 100644 --- a/tests/test_document_functions.py +++ b/tests/test_document_functions.py @@ -60,6 +60,13 @@ def test_pipeline_passes_previous_output_to_next_function(): assert result.content == "Final" +def test_pipeline_separator_inside_quotes_is_literal(): + result = render_document_functions('{{mkt:text.replace 
"a|b" "|" "/"}}') + + assert result.valid + assert result.content == "a/b" + + def test_context_variables_can_be_used_in_function_arguments(): context = ProcessingContext(variables={"title": "Architecture Decision"}) @@ -75,6 +82,13 @@ def test_validate_document_functions_reports_forbidden_calls(): assert result.diagnostics[0].code == "function.forbidden" +def test_validate_document_functions_reports_argument_errors(): + result = validate_document_functions("{{mkt:text.upper draft unexpected=value}}") + + assert not result.valid + assert result.diagnostics[0].code == "function.arguments" + + def test_registry_can_expose_custom_function_without_core_rewrite(): registry = DocumentFunctionRegistry() registry.register( diff --git a/workplans/MKTT-WP-0012-document-function-layer.md b/workplans/MKTT-WP-0012-document-function-layer.md index 2843592..2209905 100644 --- a/workplans/MKTT-WP-0012-document-function-layer.md +++ b/workplans/MKTT-WP-0012-document-function-layer.md @@ -45,7 +45,9 @@ Implemented the first deterministic document function layer: - Conservative fenced syntax: `mkt-function function.name ...`. - Pipeline chaining with `|`, where the previous result becomes the next function's first argument. +- Quoted `|` characters remain literal inside function arguments. - `ProcessingContext.variables` bindings through `${name}` values. +- Validation reports unknown or duplicate named arguments before execution. - Built-in deterministic functions for text operations, Markdown headings, bold text, links, code blocks, and context value lookup. - `mkt function list`, `mkt function check`, and `mkt function render`. @@ -53,9 +55,10 @@ Implemented the first deterministic document function layer: - Documentation and examples in `docs/document-functions.md` and `examples/functions/basic-functions.md`. 
-Assisted, filesystem, network, external-process, render/export, and live policy -service functions remain future optional extensions gated by local capability -and policy metadata. +Assisted, filesystem, network, external-process, render/export, typed-value, +asset, and live policy service functions remain future optional extensions +gated by local capability and policy metadata. The follow-on extension track is +captured in `MKTT-WP-0015`. ## Background diff --git a/workplans/MKTT-WP-0015-render-and-document-function-extensions.md b/workplans/MKTT-WP-0015-render-and-document-function-extensions.md new file mode 100644 index 0000000..5686bc0 --- /dev/null +++ b/workplans/MKTT-WP-0015-render-and-document-function-extensions.md @@ -0,0 +1,206 @@ +--- +id: MKTT-WP-0015 +type: workplan +title: "Render And Document Function Extensions" +domain: markitect +status: todo +owner: markitect-tool +topic_slug: markitect +planning_priority: P2 +planning_order: 130 +depends_on_workplans: + - MKTT-WP-0010 + - MKTT-WP-0011 + - MKTT-WP-0012 +related_workplans: + - MKTT-WP-0007 + - MKTT-WP-0008 + - MKTT-WP-0009 + - MKTT-WP-0013 +created: "2026-05-04" +updated: "2026-05-04" +state_hub_workstream_id: "a38f676a-0d0b-493c-9792-2e34480c3681" +--- + +# MKTT-WP-0015: Render And Document Function Extensions + +## Purpose + +Capture the natural follow-on work from the Quarkdown comparison and the first +Markitect document-function layer. + +The current function layer is intentionally small: deterministic functions, +Markdown-native explicit syntax, local context variables, diagnostics, +provenance, and capability metadata. This workplan should extend that model +only when the need is concrete, keeping the core framework clean and avoiding a +second workflow engine. + +## Background + +Quarkdown shows the value of a document language where functions are not just +macros. They can return typed values, Markdown content, layout structures, +tables, dictionaries, booleans, and renderable nodes. 
Its compiler expands +function-call nodes, maps output values back to renderable nodes, and then +continues through traversal, rendering, and post-rendering stages. + +Markitect should not become a Quarkdown clone. The better fit is: + +- keep Markitect as the contract, reference, processor, workflow, cache, + provenance, and policy framework +- make document functions an authoring surface over those primitives +- add render/export behavior as optional extensions +- use Quarkdown as an optional external publishing target where that is useful + +## Decision + +Defer this work until after the current original successor work is stable, +unless a concrete document publishing, render provenance, or function-language +use case becomes urgent. + +When picked up, treat this as an extension workplan. It may evolve framework +interfaces, but should not make Quarkdown, flex-auth, network access, live LLM +calls, filesystem writes, or external processes required for deterministic +Markitect parsing and function validation. + +## P15.1 - Typed document values and value mapping + +```task +id: MKTT-WP-0015-T001 +status: todo +priority: high +state_hub_task_id: "995945c5-6cec-435c-8943-b8da0a9ff89d" +``` + +Define a typed value model for document functions: + +- string, number, boolean, none +- Markdown content +- list and dictionary values +- references and content units +- tables and records +- diagnostics-friendly unknown or dynamic values + +Define how each value maps back to Markdown or structured output. Keep the +mapper deterministic and inspectable. + +Output: value model, mapper API, tests, and documentation. 
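A minimal sketch of how the P15.1 value model and deterministic mapper might look. Every name here (`MarkdownContent`, `DocumentValue`, `to_markdown`) is hypothetical and does not appear in this diff. It covers only the scalar, Markdown, list, and dictionary cases; references, tables, and records are left out.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class MarkdownContent:
    """Hypothetical wrapper for text that is already Markdown."""
    text: str


# Hypothetical value union; the workplan also lists references, tables, records.
DocumentValue = Union[None, bool, int, float, str, MarkdownContent, list, dict]


def to_markdown(value: DocumentValue) -> str:
    """Map a document value back to Markdown in a deterministic way."""
    if value is None:
        return ""
    if isinstance(value, MarkdownContent):
        return value.text
    if isinstance(value, bool):
        # Checked before int because bool is a subclass of int.
        return "true" if value else "false"
    if isinstance(value, (int, float, str)):
        return str(value)
    if isinstance(value, list):
        return "\n".join(f"- {to_markdown(item)}" for item in value)
    if isinstance(value, dict):
        # Keys are assumed to be strings; sorting keeps output independent
        # of insertion order.
        return "\n".join(
            f"- **{key}**: {to_markdown(value[key])}" for key in sorted(value)
        )
    # Unknown values fail loudly so a caller can turn this into a diagnostic.
    raise TypeError(f"Unsupported document value: {type(value).__name__}")
```

Sorting dictionary keys is one way to keep the mapper inspectable and repeatable. Whether unknown values should raise or map to a diagnostic placeholder is a design question the workplan leaves open.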
+ +## P15.2 - Richer function syntax without losing Markdown compatibility + +```task +id: MKTT-WP-0015-T002 +status: todo +priority: medium +state_hub_task_id: "bfce1388-e123-4e91-a5ab-ba67d21c22b8" +``` + +Evaluate syntax extensions that improve author ergonomics without turning +Markitect into a full compiler language: + +- multiline argument continuation +- nested function expressions +- clearer escaping rules +- block-body argument refinements +- source spans beyond line numbers +- cycle and depth limits for nested calls + +Output: syntax compatibility note, parser tests, and diagnostics examples. + +## P15.3 - Document-local reusable functions + +```task +id: MKTT-WP-0015-T003 +status: todo +priority: medium +state_hub_task_id: "a8a8f017-3622-47f1-814e-0c71bd49a42f" +``` + +Explore document-local reusable functions as a constrained, contract-aware +extension: + +- named reusable snippets +- parameter lists and default values +- body arguments +- provenance for expansions +- validation against allowed function namespaces + +Avoid general-purpose Turing-complete scripting in core. If assisted or +external behavior is needed, route it through workflow steps and explicit +capability gates. + +Output: design proposal and one deterministic prototype if justified. + +## P15.4 - Quarkdown and render/export adapters + +```task +id: MKTT-WP-0015-T004 +status: todo +priority: high +state_hub_task_id: "69e550a0-188b-4bc4-9658-47219b090904" +``` + +Design optional render/export adapters: + +- emit Quarkdown source from Markitect references, processors, templates, and + function calls +- support output profiles such as plain, docs, slides, paged, and static site +- invoke external renderers only through declared capabilities +- keep direct code reuse license-safe +- track source to rendered-artifact provenance + +Output: adapter interface, Quarkdown export sketch, policy model, and tests +with deterministic fake renderers. 
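One possible shape for the P15.4 adapter interface, sketched with a deterministic fake renderer so it can be tested without external services. `RenderAdapter`, `RenderedArtifact`, `FakeRenderer`, `FakeExternalRenderer`, `render_with`, and the `external_process` capability name are all assumptions made for illustration; none of them exist in this diff.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Protocol


@dataclass(frozen=True)
class RenderedArtifact:
    profile: str
    content: str
    source_sha256: str  # ties the artifact back to the exact source text


class RenderAdapter(Protocol):
    """Hypothetical adapter contract: declare capabilities, then render."""
    required_capabilities: FrozenSet[str]

    def render(self, source: str, profile: str) -> RenderedArtifact: ...


class FakeRenderer:
    """Deterministic stand-in that needs no external capability."""
    required_capabilities: FrozenSet[str] = frozenset()

    def render(self, source: str, profile: str) -> RenderedArtifact:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        return RenderedArtifact(profile, f"[{profile}]\n{source}", digest)


class FakeExternalRenderer(FakeRenderer):
    """Same output, but declares a capability as a real external tool would."""
    required_capabilities: FrozenSet[str] = frozenset({"external_process"})


def render_with(
    adapter: RenderAdapter, source: str, profile: str, granted: FrozenSet[str]
) -> RenderedArtifact:
    # Refuse to call the adapter unless every declared capability is granted.
    missing = adapter.required_capabilities - granted
    if missing:
        raise PermissionError(f"Missing capabilities: {sorted(missing)}")
    return adapter.render(source, profile)
```

Storing a source checksum on the artifact is one simple way to get source-to-artifact provenance. A fuller model would presumably also record the adapter identity and profile options.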
+ +## P15.5 - Render-aware references, numbering, and assets + +```task +id: MKTT-WP-0015-T005 +status: todo +priority: high +state_hub_task_id: "53eb9f94-830b-4fdf-bb47-3f549048c82a" +``` + +Extend the reference model for rendered documents: + +- figures, tables, equations, code blocks, and custom numbered units +- generated table of contents and cross-reference links +- static asset manifests +- media checksums and copy policies +- root output asset references + +Output: reference/asset manifest model and docs with examples. + +## P15.6 - Permission sandbox for non-core functions + +```task +id: MKTT-WP-0015-T006 +status: todo +priority: high +state_hub_task_id: "9ef2c516-2cd0-40ba-b270-abefbfd8fc40" +``` + +Add explicit local permission gates for functions that need: + +- filesystem reads or writes +- network access +- external processes +- native content inclusion +- assisted generation +- render/export side effects + +Use Markitect-local policy contracts first. flex-auth, OpenFGA, OPA, Cedar, +Keycloak, Entra, or similar systems may be optional adapters, but must not be +required for deterministic function parsing, validation, and rendering of pure +functions. + +Output: permission vocabulary, denied-operation diagnostics, and policy tests. + +## Exit Criteria + +- Core deterministic document functions remain simple and dependency-light. +- Richer functions are optional extensions with declared capabilities. +- Render/export adapters can be tested without live external services. +- Quarkdown interoperability is conceptually supported without direct code + dependency. +- Typed values, render provenance, references, and assets have clear contracts. +- The extension does not duplicate the dataflow workflow engine.
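The permission vocabulary and denied-operation diagnostics from P15.6 could start from something as small as the sketch below. The `Permission` values, the `DeniedOperation` record, the `function.permission_denied` code, and the `file.include` function id in the usage note are hypothetical. They only mirror the dotted diagnostic-code style already used by `function.forbidden` and `function.not_allowed`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Permission(str, Enum):
    """Hypothetical vocabulary derived from the P15.6 bullet list."""
    FILESYSTEM_READ = "filesystem.read"
    FILESYSTEM_WRITE = "filesystem.write"
    NETWORK = "network"
    EXTERNAL_PROCESS = "external_process"
    NATIVE_INCLUDE = "native_include"
    ASSISTED = "assisted"
    RENDER_EXPORT = "render_export"


@dataclass(frozen=True)
class DeniedOperation:
    code: str
    function_id: str
    permission: str
    message: str


def check_permissions(
    function_id: str,
    required: Iterable[Permission],
    granted: Iterable[Permission],
) -> List[DeniedOperation]:
    """Return one diagnostic per required permission that was not granted."""
    # Sort by value so the diagnostics come out in a stable order.
    denied = sorted(set(required) - set(granted), key=lambda p: p.value)
    return [
        DeniedOperation(
            code="function.permission_denied",
            function_id=function_id,
            permission=permission.value,
            message=(
                f"Function `{function_id}` requires `{permission.value}`, "
                "which is not granted."
            ),
        )
        for permission in denied
    ]
```

A pure function such as `text.upper` would declare no permissions and always pass. A hypothetical `file.include` that declares `Permission.FILESYSTEM_READ` would produce a diagnostic unless that permission is granted locally.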