---
id: MKTT-WP-0011
type: workplan
title: "Markdown Dataflow Pipeline Workflows"
domain: markitect
status: done
owner: markitect-tool
topic_slug: markitect
planning_priority: P2
planning_order: 75
depends_on_workplans:
- MKTT-WP-0003
depends_on_tasks:
- MKTT-WP-0010-T001
- MKTT-WP-0010-T005
related_workplans:
- MKTT-WP-0005
- MKTT-WP-0006
- MKTT-WP-0008
- MKTT-WP-0009
created: "2026-05-04"
updated: "2026-05-04"
state_hub_workstream_id: "ed4c491d-4f81-4df0-af51-5f4bd4d1ad91"
---

# MKTT-WP-0011: Markdown Dataflow Pipeline Workflows

## Purpose

Create a declarative workflow layer for Markdown-to-Markdown dataflow:
collecting data from one or more Markdown sources, applying deterministic and
optional assisted processing, and injecting the results into one or more
Markdown outputs.

## Background

The current toolkit has strong primitives: parse, query, extract, transform,
compose, include, template render, contract stub generation, generation plans,
and a provider-neutral assisted-generation hook.

What is missing is orchestration. Users can script the pieces manually, but
there is not yet a first-class workflow model for:

```text
Markdown sources -> extracted data products -> processors -> generated outputs
```

See `docs/markdown-dataflow-workflow-assessment.md`.

## P11.1 - Define workflow plan model

```task
id: MKTT-WP-0011-T001
status: done
priority: high
state_hub_task_id: "c335cbaa-dfb9-4df5-b1ae-87aaf6097bd8"
```

Define a Markdown/YAML workflow plan format with sources, named data products,
steps, outputs, variables, dry-run behavior, diagnostics, and provenance.

Output: workflow schema, examples, and validation diagnostics.

Implemented: `docs/workflow-definition-standard.md` defines the workflow
standard with metadata, intent, inputs, outputs, steps, dependencies,
conditions, artifacts, permissions, resource requirements, timeouts, retry
policies, escalation rules, observability events, and human/agent/system
responsibility boundaries. `WorkflowPlan` preserves all standard sections and
unknown extension fields.
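
One way a plan model can keep unknown extension fields is sketched below. This is an illustration only, not the project's `WorkflowPlan`: the class name `WorkflowPlanSketch`, the reduced `KNOWN_SECTIONS` list, the `extensions` attribute, and both method names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical subset of the standard sections; the authoritative list is in
# docs/workflow-definition-standard.md and may differ.
KNOWN_SECTIONS = ("metadata", "inputs", "steps", "outputs")


@dataclass
class WorkflowPlanSketch:
    """Illustrative stand-in for a plan model, not the project's class."""

    metadata: dict[str, Any] = field(default_factory=dict)
    inputs: dict[str, Any] = field(default_factory=dict)
    steps: list[dict[str, Any]] = field(default_factory=list)
    outputs: list[dict[str, Any]] = field(default_factory=list)
    extensions: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_mapping(cls, raw: dict[str, Any]) -> "WorkflowPlanSketch":
        known = {k: raw[k] for k in KNOWN_SECTIONS if k in raw}
        # Keys outside the known sections are kept instead of being dropped.
        extensions = {k: v for k, v in raw.items() if k not in KNOWN_SECTIONS}
        return cls(**known, extensions=extensions)

    def to_mapping(self) -> dict[str, Any]:
        # Known sections first (with defaults filled in), then extensions.
        data = {name: getattr(self, name) for name in KNOWN_SECTIONS}
        data.update(self.extensions)
        return data


plan = WorkflowPlanSketch.from_mapping(
    {"metadata": {"id": "demo"}, "steps": [], "x-team": "docs"}
)
```

Here `plan.extensions` holds the `x-team` entry, and `plan.to_mapping()` writes it back out next to the known sections, so a load/save round trip does not lose fields the model does not understand.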

## P11.2 - Implement Markdown source collectors

```task
id: MKTT-WP-0011-T002
status: done
priority: high
state_hub_task_id: "16a89801-d96d-437f-a883-81d09586f47a"
```

Collect source data from files, globs, directories, frontmatter paths,
selectors, sections, blocks, metrics, and future reference/index backends.

Output: source collector API, selector integration, and tests.

Implemented: workflow inputs collect `file`, `path`, `files`, `glob`, and
`directory` Markdown sources and support frontmatter filters, metrics,
frontmatter, selectors, and named extractions.
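
A glob collector with a frontmatter filter could look roughly like the sketch below. It is a simplified illustration under stated assumptions: `read_frontmatter` and `collect_sources` are hypothetical names, and the parser only handles flat `key: value` frontmatter, whereas a real collector would use a full YAML parser.

```python
import tempfile
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter; nested YAML is out of scope here."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    data: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return data
        key, sep, value = line.partition(":")
        if sep:
            data[key.strip()] = value.strip().strip('"')
    return {}  # no closing delimiter: treat as no frontmatter


def collect_sources(root: Path, pattern: str, where: dict[str, str]) -> list[Path]:
    """Return files under `root` that match `pattern` and every filter."""
    matches = []
    for path in sorted(root.glob(pattern)):  # sorted for repeatable ordering
        meta = read_frontmatter(path.read_text(encoding="utf-8"))
        if all(meta.get(key) == value for key, value in where.items()):
            matches.append(path)
    return matches


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.md").write_text("---\nstatus: accepted\n---\n# A\n", encoding="utf-8")
    (root / "b.md").write_text("---\nstatus: draft\n---\n# B\n", encoding="utf-8")
    accepted = [p.name for p in collect_sources(root, "*.md", {"status": "accepted"})]
```

Sorting the glob results keeps the collected order stable between runs, which matters for the repeatability goal in the exit criteria.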

## P11.3 - Implement deterministic step registry

```task
id: MKTT-WP-0011-T003
status: done
priority: high
state_hub_task_id: "808bed93-c7e2-4b34-90f4-f6f961fef503"
```

Create step types for query/extract, transform, compose, include, template
render, contract stub generation, contract checks, and data shaping.

Output: deterministic workflow runner with dependency ordering.

Implemented: deterministic workflow runner supports dependency ordering and
step kinds `shape`, `extract`, `query`, `template`, `compose`, `transform`,
`include`, `contract_stub`, `contract_check`, and the `assisted` boundary.
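
Dependency ordering of this kind can be sketched with the standard library's `graphlib`. The step dictionaries, the `depends_on` key, and the `order_steps` helper are assumptions for illustration; the runner's actual data structures are not shown in this workplan.

```python
from graphlib import CycleError, TopologicalSorter


def order_steps(steps: list[dict]) -> list[str]:
    """Return step ids so every step appears after the steps it depends on."""
    graph = {step["id"]: set(step.get("depends_on", [])) for step in steps}
    unknown = {dep for deps in graph.values() for dep in deps} - set(graph)
    if unknown:
        raise ValueError(f"unknown step dependencies: {sorted(unknown)}")
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc


steps = [
    {"id": "render", "kind": "template", "depends_on": ["summarize"]},
    {"id": "summarize", "kind": "shape", "depends_on": ["collect"]},
    {"id": "collect", "kind": "extract"},
]
order = order_steps(steps)
```

For this chain the only valid order is `collect`, `summarize`, `render`, regardless of the order in which the steps are declared. Rejecting unknown dependencies and cycles up front gives a diagnostic before any step runs.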

## P11.4 - Implement data expression and binding model

```task
id: MKTT-WP-0011-T004
status: done
priority: high
state_hub_task_id: "ea1ad9d2-3668-4b65-afb4-f490e5bfd0c6"
```

Allow workflow steps and outputs to reference previous results by stable names,
for example `${sources.adrs.decisions}` or `${steps.summary.markdown}`.

Output: expression resolver, type checks, and missing-reference diagnostics.

Implemented: `${...}` bindings preserve native types for full-expression
values, interpolate text inside longer strings, support dictionary paths,
numeric list indexes, and list projection over dictionaries.
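
The binding rules above can be illustrated with a small resolver. This is a sketch under assumptions, not the project's resolver: the `lookup` and `resolve` names are hypothetical, and the dotted syntax for list indexes (`decisions.0`) and projection (`decisions.title`) is one plausible reading of the description.

```python
import re
from typing import Any

EXPR = re.compile(r"\$\{([^}]+)\}")


def lookup(path: str, scope: Any) -> Any:
    """Walk a dotted path through dictionaries and lists."""
    value = scope
    for part in path.strip().split("."):
        if isinstance(value, dict):
            if part not in value:
                raise KeyError(f"missing reference: {path}")
            value = value[part]
        elif isinstance(value, list) and part.isdigit():
            value = value[int(part)]  # numeric list index
        elif isinstance(value, list):
            value = [item[part] for item in value]  # projection over dicts
        else:
            raise KeyError(f"missing reference: {path}")
    return value


def resolve(template: str, scope: dict[str, Any]) -> Any:
    whole = EXPR.fullmatch(template)
    if whole:  # a full-expression value keeps its native type
        return lookup(whole.group(1), scope)
    # Otherwise each expression is rendered as text inside the string.
    return EXPR.sub(lambda m: str(lookup(m.group(1), scope)), template)


scope = {
    "sources": {
        "adrs": {
            "decisions": [
                {"id": 1, "title": "Use YAML"},
                {"id": 2, "title": "Add CLI"},
            ]
        }
    }
}
titles = resolve("${sources.adrs.decisions.title}", scope)
first_id = resolve("${sources.adrs.decisions.0.id}", scope)
sentence = resolve("First: ${sources.adrs.decisions.0.title}", scope)
```

In this sketch `titles` is a list, `first_id` stays an integer, and `sentence` is a string, which mirrors the distinction between full-expression values and interpolation inside longer strings.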

## P11.5 - Add optional assisted processing step boundary

```task
id: MKTT-WP-0011-T005
status: done
priority: medium
state_hub_task_id: "ed1adc60-fdd8-4d4c-b4d7-7ce906e641c6"
```

Add assisted step support through the provider-neutral generation hook protocol.
The workflow engine must not require provider dependencies and must support
dry-run, optional steps, and policy gates before sending data to a provider.

Output: hook adapter interface and tests with fake providers.

Implemented: assisted steps use the provider-neutral generation hook boundary.
Without a hook, optional assisted steps are skipped with warning diagnostics and
required assisted steps fail. Tests include an injected fake hook.
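
The skip-or-fail behavior described above can be sketched as follows. The hook signature, the step keys (`id`, `optional`, `prompt`), the result dictionary, and the `run_assisted_step` name are all assumptions for illustration; the real hook protocol is defined elsewhere in the toolkit.

```python
from typing import Callable, Optional

# Hypothetical hook shape: takes a prompt string, returns generated Markdown.
GenerationHook = Callable[[str], str]


def run_assisted_step(
    step: dict, hook: Optional[GenerationHook], dry_run: bool = False
) -> dict:
    if dry_run:
        # Planning only: the hook is never called, so no data leaves the process.
        return {"status": "planned", "diagnostics": []}
    if hook is None:
        if step.get("optional", False):
            warning = f"step {step['id']}: no generation hook, skipped"
            return {"status": "skipped", "diagnostics": [warning]}
        raise RuntimeError(f"step {step['id']}: required assisted step has no hook")
    return {"status": "done", "markdown": hook(step["prompt"]), "diagnostics": []}


def fake_hook(prompt: str) -> str:
    """Stand-in provider for tests; no network or provider dependency."""
    return f"Reviewed: {prompt}"


skipped = run_assisted_step({"id": "review", "optional": True, "prompt": "p"}, hook=None)
done = run_assisted_step({"id": "review", "prompt": "summary"}, hook=fake_hook)
planned = run_assisted_step({"id": "review", "prompt": "summary"}, fake_hook, dry_run=True)
```

Because the hook is passed in as a plain callable, the engine itself imports nothing provider-specific, and a fake hook is enough to exercise every branch.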

## P11.6 - Implement multi-output sinks

```task
id: MKTT-WP-0011-T006
status: done
priority: high
state_hub_task_id: "902707d7-46fe-45d6-a9ec-b85763065ff9"
```

Support writing one or many Markdown outputs from templates, generated content,
or composed results. Outputs must be path-safe, reproducible, and traceable to
their source data.

Output: output sink API, path-safety checks, and provenance manifests.

Implemented: outputs can render content or templates, write multiple Markdown
files under a safe output root, support dry-run planning, and emit output
records plus provenance/trace events.
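
A path-safety check for a safe output root might look like the sketch below. The `safe_output_path` helper is a hypothetical name, and the check shown (resolve the target, then require it to sit under the resolved root) is one common approach rather than a description of the project's implementation.

```python
import tempfile
from pathlib import Path


def safe_output_path(output_root: Path, relative: str) -> Path:
    """Resolve `relative` under `output_root`, rejecting escapes from the root."""
    root = output_root.resolve()
    # resolve() collapses `..` segments and follows symlinks before the check.
    target = (root / relative).resolve()
    if not target.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"output path escapes output root: {relative}")
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    ok = safe_output_path(root, "notes/release.md").relative_to(root.resolve())
    try:
        safe_output_path(root, "../outside.md")
        escaped = True
    except ValueError:
        escaped = False
```

An absolute `relative` value is rejected as well, because joining an absolute path onto the root discards the root and the resolved target then falls outside it.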

## P11.7 - Add workflow CLI

```task
id: MKTT-WP-0011-T007
status: done
priority: high
state_hub_task_id: "ccc26867-5724-4205-b3fe-a8b9d046775d"
```

Add:

```text
mkt workflow inspect <workflow.md>
mkt workflow plan <workflow.md>
mkt workflow run <workflow.md>
```

Include JSON/YAML outputs for agent use.

Implemented:

- `mkt workflow inspect <workflow.md>`
- `mkt workflow plan <workflow.md>`
- `mkt workflow run <workflow.md>`

All commands support text, JSON, and YAML output.

## P11.8 - Add representative end-to-end examples

```task
id: MKTT-WP-0011-T008
status: done
priority: high
state_hub_task_id: "f8501ea6-1ead-477d-8f64-c196e7edfe68"
```

Create examples covering:

- multiple ADRs -> release notes
- contract data -> generated documents
- source snippets -> docs
- deterministic summary -> optional assisted review -> final Markdown

Implemented: examples under `examples/workflows/` cover ADR release notes,
source snippet extraction, and an optional assisted review boundary.

## Exit Criteria

- A non-programmer can write a Markdown/YAML workflow that extracts data from
  Markdown documents and generates new Markdown outputs.
- The same workflow is repeatable for identical inputs.
- Assisted steps are optional and external.
- Diagnostics identify which source, step, or output failed.
- The implementation remains compatible with future references/processors,
  cache/provenance, context engines, and access-control policy.