declarative Markdown workflow layer

2026-05-04 12:35:59 +02:00
parent 33fa602fe5
commit 0bc63aab9f
19 changed files with 1854 additions and 11 deletions

docs/markdown-workflows.md

@@ -0,0 +1,102 @@
# Markdown Workflows
Markitect workflows provide declarative orchestration for Markdown-centered
document pipelines.
Use them when you want to:
- collect Markdown files, globs, or directories
- extract frontmatter, sections, blocks, metrics, or selector results
- bind named data products into later steps
- run deterministic template/compose/transform/include steps
- define optional assisted-generation boundaries without requiring a provider
- write one or more Markdown outputs with provenance and diagnostics
The workflow definition standard is documented in
`docs/workflow-definition-standard.md`.
## Commands
Inspect a workflow definition:
```text
mkt workflow inspect examples/workflows/adr-release-notes.workflow.md
```
Plan a run without writing outputs:
```text
mkt workflow plan examples/workflows/adr-release-notes.workflow.md
```
Run and write outputs:
```text
mkt workflow run examples/workflows/adr-release-notes.workflow.md --output-dir build
```
JSON/YAML output is available for agents:
```text
mkt workflow run workflow.md --format json
```
## Execution Model
The first runner is deterministic and local-first:
1. Load a YAML or Markdown-fenced workflow definition.
2. Validate required ids and kinds, and check for duplicate step ids.
3. Collect inputs from Markdown files, directories, globs, or literal values.
4. Resolve `${...}` bindings.
5. Execute steps in dependency order.
6. Render outputs and enforce output path safety.
7. Return diagnostics, provenance, and trace events.
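Step 5 ("execute steps in dependency order") can be illustrated with a topological sort. This is a minimal sketch, not the runner's actual code: it assumes dependencies come only from each step's `depends_on` list, and the function name `execution_order` and the step id `shape_data` are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(steps: dict[str, dict]) -> list[str]:
    """Return step ids so that every step comes after its `depends_on` entries."""
    sorter = TopologicalSorter()
    for step_id, spec in steps.items():
        # Each predecessor listed in `depends_on` must run before `step_id`.
        sorter.add(step_id, *spec.get("depends_on", []))
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError(f"steps contain a dependency cycle: {exc.args[1]}") from exc


# Hypothetical workflow: declaration order differs from execution order.
steps = {
    "render": {"kind": "template", "depends_on": ["shape_data"]},
    "shape_data": {"kind": "shape"},
    "review": {"kind": "assisted", "depends_on": ["render"]},
}
order = execution_order(steps)
```

Whether the real runner also infers dependencies from `${steps...}` bindings is not stated here; the sketch ignores that possibility.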
Assisted steps are explicit boundaries. Without an injected generation hook:
- optional assisted steps are skipped with a warning
- required assisted steps fail
This makes workflows useful without provider dependencies.
## Supported Step Kinds
| Kind | Result |
| --- | --- |
| `shape` | Structured data object. |
| `extract` | `items`, `count`, and joined `text`. |
| `query` | Query `matches` and `count`. |
| `template` | Rendered `markdown`, variables, missing variables, completion flag. |
| `compose` | Composed `markdown` and sources. |
| `transform` | Transformed `markdown`, operations, provenance. |
| `include` | Include-resolved `markdown`, included paths, provenance. |
| `contract_stub` | Generated contract stub Markdown. |
| `contract_check` | Contract diagnostics and metrics. |
| `assisted` | Generated Markdown if a hook is supplied, otherwise skipped/diagnostic. |
## Data Bindings
Bindings use `${...}`:
```yaml
data:
  decisions: ${sources.adrs.extracts.decisions}
  summary: ${steps.render.markdown}
```
If the full string is one expression, the native type is preserved. If the
expression appears inside a longer string, it is rendered as text.
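The type-preservation rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the engine's resolver: `resolve`, `lookup`, and the regular expression are hypothetical, lookups are limited to nested dictionaries, and `str()` stands in for whatever text rendering the runner really uses.

```python
import re

# Matches one `${...}` expression; the inner path is captured.
EXPR = re.compile(r"\$\{([^}]+)\}")


def lookup(context: dict, path: str):
    """Follow a dotted path through nested dictionaries (simplified)."""
    value = context
    for key in path.strip().split("."):
        value = value[key]
    return value


def resolve(value, context: dict):
    """Resolve bindings in one value, keeping native types where possible."""
    if not isinstance(value, str):
        return value
    whole = EXPR.fullmatch(value)
    if whole:
        # The string is exactly one expression: return the value untouched.
        return lookup(context, whole.group(1))
    # Otherwise substitute each expression with its text form (assumed: str()).
    return EXPR.sub(lambda m: str(lookup(context, m.group(1))), value)


context = {
    "metadata": {"title": "Release Notes"},
    "sources": {"adrs": {"extracts": {"decisions": ["Use SQLite", "Adopt FTS5"]}}},
}
decisions = resolve("${sources.adrs.extracts.decisions}", context)  # stays a list
heading = resolve("# ${metadata.title} (draft)", context)  # becomes a string
```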
List projection is supported:
```yaml
paths: ${sources.adrs.items.path}
```
## Relationship To Extensions
The workflow engine is registered as the built-in extension
`workflow.markdown-dataflow`. It uses the canonical architecture for
diagnostics, provenance, trace events, capabilities, and CLI affordance
metadata, but workflow files remain business-facing orchestration artifacts.

@@ -0,0 +1,304 @@
# Markitect Workflow Definition Standard
## Purpose
Markitect workflows describe repeatable Markdown-centered dataflow:
```text
Markdown inputs -> extracted data products -> deterministic/assisted steps
-> artifacts and Markdown outputs
```
The workflow standard is business-facing orchestration. It uses the internal
extension framework for execution semantics, diagnostics, provenance,
capabilities, and future policy gates, but it is not itself the extension
framework.
## File Format
A workflow can be either:
- a YAML file
- a Markdown file with a fenced YAML block tagged `workflow`,
`markitect-workflow`, or `mkt-workflow`
Example:
````markdown
# Release Notes Workflow
```yaml workflow
metadata:
  id: release-notes
  title: Release Notes
intent:
  summary: Build release notes from accepted ADR decisions.
inputs:
  adrs:
    glob: docs/adr/*.md
    extract:
      decisions:
        selector: sections[heading=Decision]
outputs:
  release_notes:
    path: out/release-notes.md
    content: ${steps.render.markdown}
steps:
  render:
    kind: template
    template: templates/release-notes.md
    data:
      decisions: ${sources.adrs.extracts.decisions}
```
````
## Top-Level Sections
| Section | Required | Purpose |
| --- | --- | --- |
| `metadata` | recommended | Stable id, title, owner, version, tags, timestamps. |
| `intent` | recommended | Why the workflow exists and what success means. |
| `inputs` | yes | Markdown files, directories, globs, literal values, or future index references. |
| `steps` | yes for processing | Deterministic or assisted operations over bound data. |
| `outputs` | optional | Files/artifacts produced from step or source data. |
| `dependencies` | optional | Workflow-level dependencies on files, workplans, contracts, or other workflows. |
| `conditions` | optional | Preconditions or skip rules. First version records these for inspection. |
| `artifacts` | optional | Named non-output products such as manifests, traces, or reports. |
| `permissions` | optional | Declared filesystem/network/provider/capability requirements. |
| `resources` | optional | CPU, memory, token, model, or storage expectations. |
| `timeouts` | optional | Workflow and step timeout budgets. First version records these. |
| `retry_policies` | optional | Retry rules by step kind or id. First version records these. |
| `escalation_rules` | optional | When human approval or operator attention is needed. |
| `observability` | optional | Events, trace detail, metrics, and audit expectations. |
| `responsibilities` | optional | Human/agent/system boundaries for review, approval, and execution. |
Unknown top-level sections are preserved as `extensions` in the loaded model so
the standard can evolve without immediately breaking older runners.
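One way to picture this preservation rule is the sketch below. It only illustrates the idea: `split_sections` and the shape of the returned model are assumptions, and the section `review_board` is an invented unknown key.

```python
# Top-level sections listed in the table above.
KNOWN_SECTIONS = {
    "metadata", "intent", "inputs", "steps", "outputs", "dependencies",
    "conditions", "artifacts", "permissions", "resources", "timeouts",
    "retry_policies", "escalation_rules", "observability", "responsibilities",
}


def split_sections(raw: dict) -> dict:
    """Keep known sections as-is and collect everything else under `extensions`."""
    model = {key: value for key, value in raw.items() if key in KNOWN_SECTIONS}
    # Unknown sections are preserved rather than rejected.
    model["extensions"] = {
        key: value for key, value in raw.items() if key not in KNOWN_SECTIONS
    }
    return model


raw = {
    "metadata": {"id": "release-notes"},
    "inputs": {"adrs": {"glob": "docs/adr/*.md"}},
    "review_board": {"members": ["docs-team"]},  # hypothetical future section
}
model = split_sections(raw)
```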
## Metadata
```yaml
metadata:
  id: release-notes
  title: Release Notes
  version: "1"
  owner: documentation
  tags: [adr, release]
```
`metadata.id` should be stable. It is used in diagnostics and provenance when
available.
## Intent
```yaml
intent:
  summary: Build release notes from accepted decisions.
  success_criteria:
    - One output file is generated.
    - Every accepted ADR contributes its decision section.
```
Intent is descriptive in the first implementation. Later policy and assessment
layers may use it for review or LLM-assisted checks.
## Inputs
Inputs are named source collections or literal values.
```yaml
inputs:
  adrs:
    glob: docs/adr/*.md
    recursive: false
    where:
      frontmatter.status: accepted
    extract:
      decisions:
        selector: sections[heading=Decision]
      status:
        selector: frontmatter.status
  static_context:
    value:
      product: Markitect
```
Supported source fields:
- `file`: one Markdown file
- `path`: alias for `file`
- `files`: list of Markdown files
- `glob`: glob pattern relative to the workflow directory
- `directory`: directory of Markdown files
- `recursive`: recurse when using `directory`
- `selector`: selector to collect matches
- `extract`: named selector map
- `metrics`: include document metrics
- `frontmatter`: include frontmatter
- `value`: literal structured value, no file parsing
Each Markdown input produces a collection:
```yaml
items:
  - path: docs/adr/001.md
    frontmatter: {...}
    metrics: {...}
    extracts:
      decisions:
        - "## Decision\n\n..."
extracts:
  decisions:
    - "## Decision\n\n..."
```
## Steps
Steps are named operations. `steps` may be either a mapping keyed by step id or a list of entries that each carry an `id`.
```yaml
steps:
  render:
    kind: template
    template: templates/release-notes.md
    data:
      decisions: ${sources.adrs.extracts.decisions}
```
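Because `steps` can be written as a mapping or as a list, a loader presumably normalizes both into one shape before validating. The following is a hedged sketch of that idea: `normalize_steps` and its error messages are hypothetical, and the checks mirror only the "required ids/kinds and duplicate step ids" validation mentioned in the docs.

```python
def normalize_steps(steps) -> dict[str, dict]:
    """Turn either accepted form of `steps` into a mapping of id -> spec."""
    if isinstance(steps, dict):
        pairs = [(step_id, dict(spec)) for step_id, spec in steps.items()]
    else:
        pairs = []
        for entry in steps:
            spec = dict(entry)  # copy so the caller's data is not mutated
            step_id = spec.pop("id", None)
            if not step_id:
                raise ValueError("list-form step is missing `id`")
            pairs.append((step_id, spec))

    normalized: dict[str, dict] = {}
    for step_id, spec in pairs:
        if step_id in normalized:
            raise ValueError(f"duplicate step id: {step_id}")
        if "kind" not in spec:
            raise ValueError(f"step {step_id} is missing `kind`")
        normalized[step_id] = spec
    return normalized


as_mapping = normalize_steps({"render": {"kind": "template"}})
as_list = normalize_steps([{"id": "render", "kind": "template"}])
```

Both calls yield the same mapping, so later stages only need to handle one representation.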
Common fields:
- `kind`: step kind
- `depends_on`: other step ids
- `when`: condition expression, reserved for future execution gating
- `optional`: do not fail the whole workflow when this step is skipped or blocked
- `permissions`: step-level permissions
- `timeout`: step-level timeout declaration
- `retry`: step-level retry policy reference or inline rule
- `responsibility`: `human`, `agent`, `system`, or `mixed`
First implementation step kinds:
| Kind | Purpose |
| --- | --- |
| `shape` | Resolve data bindings into a structured object. |
| `extract` | Extract text from a source collection or document. |
| `query` | Return query match envelopes from a source collection or document. |
| `template` | Render a deterministic Markdown template. |
| `compose` | Join Markdown strings or files into one Markdown document. |
| `transform` | Apply deterministic Markdown transforms. |
| `include` | Resolve include markers in Markdown. |
| `contract_stub` | Generate a Markdown stub from a contract. |
| `contract_check` | Check a Markdown document against a contract. |
| `assisted` | Provider-neutral assisted step boundary, optional by default. |
## Data Bindings
Workflow expressions use `${...}` references:
```yaml
data:
  decisions: ${sources.adrs.extracts.decisions}
  summary: ${steps.render.markdown}
```
If a string is exactly one expression, the resolved value keeps its native type.
If an expression appears inside a longer string, it is rendered as text.
Supported roots:
- `metadata`
- `intent`
- `sources`
- `steps`
- `artifacts`
- `workflow`
Path behavior:
- dictionary keys use dot notation
- numeric list indexes are supported
- applying a field to a list maps that field over every dictionary item
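The three path rules can be sketched as a small traversal function. This is an assumption-laden illustration, not the engine's code: `walk` and `step_into` are hypothetical names, and skipping non-dictionary list items during projection is a guess at how mixed lists are handled.

```python
def step_into(value, part: str):
    """Apply one path segment to a value."""
    if isinstance(value, list):
        if part.isdigit():
            # Numeric segment: index into the list.
            return value[int(part)]
        # Field segment on a list: map the field over every dictionary item.
        return [step_into(item, part) for item in value if isinstance(item, dict)]
    # Dictionary key via dot notation.
    return value[part]


def walk(root, path: str):
    """Resolve a dotted path such as `sources.adrs.items.path`."""
    value = root
    for part in path.split("."):
        value = step_into(value, part)
    return value


root = {
    "sources": {
        "adrs": {
            "items": [
                {"path": "docs/adr/001.md"},
                {"path": "docs/adr/002.md"},
            ]
        }
    }
}
all_paths = walk(root, "sources.adrs.items.path")
first_path = walk(root, "sources.adrs.items.0.path")
```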
## Outputs
```yaml
outputs:
  release_notes:
    path: out/release-notes.md
    content: ${steps.render.markdown}
```
Supported output fields:
- `path`: output path relative to `--output-dir` or the workflow directory
- `content`: Markdown/string/structured value to write
- `template`: optional template path
- `data`: data for template rendering
- `artifact`: optional artifact name
Output paths must stay within the output root.
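A containment check of this kind is commonly written with `pathlib`. The sketch below shows one possible approach, assuming the output root is a directory and that a violation should raise an error. `safe_output_path` is a hypothetical name, and the real runner may report a diagnostic instead of raising.

```python
from pathlib import Path


def safe_output_path(output_root: str, relative: str) -> Path:
    """Resolve an output path and refuse anything outside the output root."""
    root = Path(output_root).resolve()
    # Resolving collapses `..` segments (and symlinks) before the comparison.
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        raise ValueError(f"output path escapes the output root: {relative}")
    return candidate


target = safe_output_path("build", "out/release-notes.md")
```

An absolute `relative` value or one containing `../` segments that climb above the root is rejected, because the resolved candidate no longer sits under the resolved root.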
## Assisted Steps
Assisted steps define the boundary; they do not require a provider dependency:
```yaml
review:
  kind: assisted
  optional: true
  prompt: prompts/review.md
  input: ${steps.render.markdown}
  data:
    rubric: concise release note review
```
When no assisted adapter is supplied:
- optional assisted steps are skipped with a warning diagnostic
- required assisted steps fail with an error diagnostic
This keeps workflows runnable in deterministic environments.
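The skip-or-fail behavior can be sketched as follows. The result dictionary, the `run_assisted` name, and the hook signature are all hypothetical. The default of `optional: true` follows the step-kind table, which describes assisted steps as optional by default.

```python
from typing import Callable, Optional


def run_assisted(
    step_id: str,
    spec: dict,
    hook: Optional[Callable[[dict], str]] = None,
) -> dict:
    """Run one assisted step, or skip/fail it when no hook is supplied."""
    if hook is not None:
        return {"status": "completed", "markdown": hook(spec), "diagnostics": []}
    if spec.get("optional", True):
        # Optional step without a hook: skip and warn.
        return {
            "status": "skipped",
            "diagnostics": [
                {"severity": "warning", "message": f"{step_id}: no assisted hook"}
            ],
        }
    # Required step without a hook: fail with an error diagnostic.
    return {
        "status": "failed",
        "diagnostics": [
            {"severity": "error", "message": f"{step_id}: no assisted hook"}
        ],
    }


skipped = run_assisted("review", {"kind": "assisted", "optional": True})
failed = run_assisted("review", {"kind": "assisted", "optional": False})
done = run_assisted("review", {"kind": "assisted"}, hook=lambda spec: "# Reviewed")
```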
## Permissions And Responsibilities
Permissions and responsibilities are declarative in the first runner and become
policy inputs for later access-control work.
```yaml
permissions:
  filesystem:
    read: [docs, templates]
    write: [out]
  network: false
  assisted_generation: false
responsibilities:
  human:
    approves_outputs: true
  agent:
    may_run_deterministic_steps: true
  system:
    enforces_path_safety: true
```
## Observability
```yaml
observability:
  events:
    - workflow.started
    - step.completed
    - output.written
  trace: summary
```
Workflow runs return trace events and diagnostics in JSON/YAML output. Future
backends can persist these events.
## Design Rules
- Keep deterministic execution useful without providers.
- Keep all outputs explainable through provenance and diagnostics.
- Preserve unknown metadata rather than rejecting reasonable future extensions.
- Make assisted, external, networked, or sensitive steps explicit.
- Keep internal extension registration separate from business workflow
orchestration.

@@ -36,7 +36,7 @@ and descriptions mirror the operational view.
| `MKTT-WP-0007` | complete | done | `MKTT-WP-0006` | Advanced query and local index backend is complete: AST inspection, optional JSONPath, SQLite snapshots/metadata, FTS5 search, incremental refresh, and local index CLI. |
| `MKTT-WP-0013` | complete | done | `MKTT-WP-0003`, `MKTT-WP-0004`, `MKTT-WP-0006`, `MKTT-WP-0007`, `MKTT-WP-0010` | Internal extension framework is complete: characterization tests, canonical processing model, descriptors, registries, lifecycle callbacks, query-engine registry, built-in extension catalog, CLI command specs, and authoring guide. |
| `MKTT-WP-0005` | P2 | todo | `MKTT-WP-0003`, `MKTT-WP-0004` | Pick up when generation/form/context or semantic assessment pressure appears. |
-| `MKTT-WP-0011` | P2 | todo | `MKTT-WP-0003`; task-level triggers: `MKTT-WP-0010-T001`, `MKTT-WP-0010-T005` | Declarative Markdown dataflow workflows: source extraction, deterministic/assisted processing, and multi-output generation. |
+| `MKTT-WP-0011` | complete | done | `MKTT-WP-0003`; task-level triggers: `MKTT-WP-0010-T001`, `MKTT-WP-0010-T005` | Markdown dataflow workflow layer is complete: workflow standard, source collectors, binding model, deterministic steps, assisted boundary, safe outputs, CLI, docs, and examples. |
| `MKTT-WP-0009` | P2 | todo | `MKTT-WP-0006` | Establish access-control gateway before security-sensitive cache/context use. |
| `MKTT-WP-0012` | P3 | todo | `MKTT-WP-0004`, `MKTT-WP-0010`, `MKTT-WP-0011` | Future Quarkdown-inspired document function layer: reusable Markdown-native function calls over processors, references, contracts, workflows, and later assisted steps. |
| `MKTT-WP-0008` | P3 | todo | `MKTT-WP-0006`, `MKTT-WP-0007`, `MKTT-WP-0009` | Agent working-memory cache after backend and policy floor are available. |