markitect-tool/workplans/MKTT-WP-0011-markdown-dataflow-pipeline-workflows.md

id: MKTT-WP-0011
type: workplan
title: Markdown Dataflow Pipeline Workflows
domain: markitect
status: todo
owner: markitect-tool
topic_slug: markitect
planning_priority: P2
planning_order: 75
depends_on_workplans: MKTT-WP-0003
depends_on_tasks: MKTT-WP-0010-T001, MKTT-WP-0010-T005
related_workplans: MKTT-WP-0005, MKTT-WP-0006, MKTT-WP-0008, MKTT-WP-0009
created: 2026-05-04
updated: 2026-05-04
state_hub_workstream_id: ed4c491d-4f81-4df0-af51-5f4bd4d1ad91

MKTT-WP-0011: Markdown Dataflow Pipeline Workflows

Purpose

Create a declarative workflow layer for Markdown-to-Markdown dataflow: collecting data from one or more Markdown sources, applying deterministic and optional assisted processing, and injecting the results into one or more Markdown outputs.

Background

The current toolkit has strong primitives: parse, query, extract, transform, compose, include, template render, contract stub generation, generation plans, and a provider-neutral assisted-generation hook.

What is missing is orchestration. Users can script the pieces manually, but there is not yet a first-class workflow model for:

Markdown sources -> extracted data products -> processors -> generated outputs

See docs/markdown-dataflow-workflow-assessment.md.

P11.1 - Define workflow plan model

id: MKTT-WP-0011-T001
status: todo
priority: high
state_hub_task_id: "c335cbaa-dfb9-4df5-b1ae-87aaf6097bd8"

Define a Markdown/YAML workflow plan format with sources, named data products, steps, outputs, variables, dry-run behavior, diagnostics, and provenance.

Output: workflow schema, examples, and validation diagnostics.
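
To make the shape of such a plan concrete, here is a minimal Python sketch of an in-memory plan model with one validation pass. Every class, field, and function name below (Source, Step, Output, WorkflowPlan, validate) is a hypothetical illustration, not the schema this task will define; the real format may differ once P11.1 is done.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    name: str
    path: str            # file, glob, or directory (assumed field)
    selector: str = ""   # optional section/block selector (assumed field)


@dataclass
class Step:
    name: str
    kind: str                                   # e.g. "extract", "transform"
    inputs: list[str] = field(default_factory=list)
    optional: bool = False


@dataclass
class Output:
    path: str
    content: str  # expression such as "${steps.summary.markdown}"


@dataclass
class WorkflowPlan:
    sources: list[Source]
    steps: list[Step]
    outputs: list[Output]
    variables: dict[str, str] = field(default_factory=dict)
    dry_run: bool = False


def validate(plan: WorkflowPlan) -> list[str]:
    """Return diagnostics; an empty list means no problems were found."""
    diagnostics = []
    # Names a step may refer to: all sources, plus steps declared before it.
    known = {f"sources.{s.name}" for s in plan.sources}
    for step in plan.steps:
        for ref in step.inputs:
            if ref not in known:
                diagnostics.append(f"step '{step.name}': unknown input '{ref}'")
        known.add(f"steps.{step.name}")
    if not plan.outputs:
        diagnostics.append("plan declares no outputs")
    return diagnostics
```

Returning a list of diagnostics instead of raising on the first problem lets a caller report every issue in a plan at once, which suits the "validation diagnostics" deliverable.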

P11.2 - Implement Markdown source collectors

id: MKTT-WP-0011-T002
status: todo
priority: high
state_hub_task_id: "16a89801-d96d-437f-a883-81d09586f47a"

Collect source data from files, globs, directories, frontmatter paths, selectors, sections, blocks, metrics, and future reference/index backends.

Output: source collector API, selector integration, and tests.
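
As a rough illustration of a glob-plus-section collector, the sketch below uses only the Python standard library. The function names (extract_section, collect) are hypothetical and it does not use the toolkit's existing parse or selector primitives, which a real collector would presumably build on. It assumes ATX-style headings and does not account for heading-like lines inside fenced code blocks.

```python
import re
from pathlib import Path

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def extract_section(markdown: str, title: str) -> str:
    """Return the text under the first heading named `title`, up to the next
    heading of the same or a higher level. Returns "" if no heading matches."""
    body, level = [], None
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if level is None:
            # Still searching for the requested heading.
            if match and match.group(2) == title:
                level = len(match.group(1))
            continue
        if match and len(match.group(1)) <= level:
            break  # a sibling or parent heading ends the section
        body.append(line)
    return "\n".join(body).strip()


def collect(root: Path, pattern: str, section: str) -> dict[str, str]:
    """Map each file matching `pattern` under `root` to its section text."""
    results = {}
    for path in sorted(root.glob(pattern)):  # sorted for repeatable ordering
        text = path.read_text(encoding="utf-8")
        results[path.relative_to(root).as_posix()] = extract_section(text, section)
    return results
```

Sorting the glob results keeps the collected data in a stable order, which matters for the repeatability requirement in the exit criteria.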

P11.3 - Implement deterministic step registry

id: MKTT-WP-0011-T003
status: todo
priority: high
state_hub_task_id: "808bed93-c7e2-4b34-90f4-f6f961fef503"

Create step types for query/extract, transform, compose, include, template render, contract stub generation, contract checks, and data shaping.

Output: deterministic workflow runner with dependency ordering.
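
One way to get dependency ordering is a topological sort over the declared step dependencies. The sketch below uses Python's standard graphlib module; the registry, the step dictionary keys (name, kind, needs, args), and the two sample step kinds are all assumptions made for illustration, not the toolkit's API.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical registry: step kind -> function(args, deps) -> result.
REGISTRY = {}


def register(kind):
    """Decorator that adds a step implementation to the registry."""
    def decorator(fn):
        REGISTRY[kind] = fn
        return fn
    return decorator


@register("literal")
def literal(args, deps):
    return args["value"]


@register("concat")
def concat(args, deps):
    # Join upstream results in the order they are listed in `needs`.
    return args.get("sep", "").join(str(value) for value in deps.values())


def run(steps):
    """Run steps in dependency order and return results keyed by step name.

    Raises graphlib.CycleError when the dependencies form a cycle, and
    KeyError when a step depends on a name that is not defined.
    """
    by_name = {step["name"]: step for step in steps}
    graph = {step["name"]: list(step.get("needs", [])) for step in steps}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        if name not in by_name:
            raise KeyError(f"step '{name}' is referenced but not defined")
        step = by_name[name]
        deps = {dep: results[dep] for dep in step.get("needs", [])}
        results[name] = REGISTRY[step["kind"]](step.get("args", {}), deps)
    return results
```

Steps can be declared in any order; the sorter guarantees that every dependency has run before the step that needs it.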

P11.4 - Implement data expression and binding model

id: MKTT-WP-0011-T004
status: todo
priority: high
state_hub_task_id: "ea1ad9d2-3668-4b65-afb4-f490e5bfd0c6"

Allow workflow steps and outputs to reference previous results by stable names, for example ${sources.adrs.decisions} or ${steps.summary.markdown}.

Output: expression resolver, type checks, and missing-reference diagnostics.
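
The reference syntax above can be resolved with a small substitution pass over a nested context. This is a minimal sketch under stated assumptions: results live in nested dictionaries, only string values may be interpolated, and the names MissingReference, lookup, and resolve are hypothetical.

```python
import re

# Matches ${dotted.path} where each segment is an identifier-like name.
REFERENCE = re.compile(r"\$\{([A-Za-z_][\w-]*(?:\.[A-Za-z_][\w-]*)*)\}")


class MissingReference(LookupError):
    """Raised when a ${...} reference does not resolve to a value."""


def lookup(path, context):
    """Walk `context` along the dotted `path`, reporting where it breaks."""
    current = context
    walked = []
    for key in path.split("."):
        walked.append(key)
        if not isinstance(current, dict) or key not in current:
            reached = ".".join(walked)
            raise MissingReference(
                f"unresolved reference '${{{path}}}': nothing at '{reached}'"
            )
        current = current[key]
    return current


def resolve(template, context):
    """Replace every ${...} reference in `template` with its string value."""
    def replace(match):
        value = lookup(match.group(1), context)
        if not isinstance(value, str):
            # Type check: only strings can be spliced into Markdown text.
            raise TypeError(
                f"reference '${{{match.group(1)}}}' is "
                f"{type(value).__name__}, expected str"
            )
        return value
    return REFERENCE.sub(replace, template)
```

The diagnostic names the first path segment that failed, so a typo in a step name is distinguishable from a missing field on an existing step.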

P11.5 - Add optional assisted processing step boundary

id: MKTT-WP-0011-T005
status: todo
priority: medium
state_hub_task_id: "ed1adc60-fdd8-4d4c-b4d7-7ce906e641c6"

Add assisted step support through the provider-neutral generation hook protocol. The workflow engine must not require provider dependencies and must support dry-run, optional steps, and policy gates before sending data to a provider.

Output: hook adapter interface and tests with fake providers.
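
The boundary described here could look roughly like the following sketch. The existing generation hook protocol is not shown in this workplan, so GenerationHook, AssistedResult, run_assisted_step, the status strings, and FakeProvider are all hypothetical stand-ins; the point is only the order of checks (dry-run, then policy gate, then provider availability).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class GenerationHook(Protocol):
    """Assumed minimal provider interface; no provider package is imported."""
    def generate(self, prompt: str) -> str: ...


@dataclass
class AssistedResult:
    text: str
    status: str  # "generated", "skipped:dry-run", "blocked:policy", ...


def run_assisted_step(
    draft: str,
    hook: Optional[GenerationHook],
    *,
    dry_run: bool = False,
    optional: bool = True,
    policy: Callable[[str], bool] = lambda text: True,
) -> AssistedResult:
    """Pass `draft` to the provider only if every gate allows it."""
    if dry_run:
        return AssistedResult(draft, "skipped:dry-run")
    if not policy(draft):
        # The policy gate runs before any data leaves the process.
        return AssistedResult(draft, "blocked:policy")
    if hook is None:
        if optional:
            return AssistedResult(draft, "skipped:no-provider")
        raise RuntimeError("assisted step is required but no hook is configured")
    return AssistedResult(hook.generate(draft), "generated")


class FakeProvider:
    """Test double that records prompts instead of calling a real service."""
    def __init__(self):
        self.calls = []

    def generate(self, prompt: str) -> str:
        self.calls.append(prompt)
        return prompt + "\n\nReviewed."
```

Because skipped and blocked steps return the deterministic draft unchanged, downstream steps can keep running whether or not a provider is present.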

P11.6 - Implement multi-output sinks

id: MKTT-WP-0011-T006
status: todo
priority: high
state_hub_task_id: "902707d7-46fe-45d6-a9ec-b85763065ff9"

Support writing one or many Markdown outputs from templates, generated content, or composed results. Outputs must be path-safe, reproducible, and traceable to their source data.

Output: output sink API, path-safety checks, and provenance manifests.
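
A path-safety check and a provenance record might be sketched as follows. This assumes that "path-safe" means an output may never resolve outside a configured output root, and that provenance can be represented by content hashes of the output and its sources; the function names and manifest keys are hypothetical.

```python
import hashlib
from pathlib import Path


def digest(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text, as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def safe_target(root: Path, relative: str) -> Path:
    """Resolve `relative` under `root`, rejecting anything that escapes it."""
    root = root.resolve()
    target = (root / relative).resolve()
    # Absolute paths and ".." segments both end up outside `root` here.
    if root not in target.parents:
        raise ValueError(f"output path '{relative}' escapes '{root}'")
    return target


def write_output(root: Path, relative: str, content: str, sources: dict) -> dict:
    """Write one Markdown output and return a provenance manifest entry.

    `sources` maps a source name to the source text that fed this output.
    """
    target = safe_target(root, relative)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return {
        "output": relative,
        "sha256": digest(content),
        "sources": {name: digest(text) for name, text in sorted(sources.items())},
    }
```

Recording hashes for both inputs and outputs gives a cheap way to confirm that a rerun on identical inputs produced identical outputs.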

P11.7 - Add workflow CLI

id: MKTT-WP-0011-T007
status: todo
priority: high
state_hub_task_id: "ccc26867-5724-4205-b3fe-a8b9d046775d"

Add:

mkt workflow inspect <workflow.md>
mkt workflow plan <workflow.md>
mkt workflow run <workflow.md>

Include JSON/YAML outputs for agent use.
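
The command layout above could be wired up with nested argparse subcommands, as in this sketch. The --format flag and the shape of the JSON report are assumptions for illustration only; the workplan states that JSON/YAML output is required but does not specify how it is selected.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mkt")
    commands = parser.add_subparsers(dest="command", required=True)
    workflow = commands.add_parser("workflow", help="workflow commands")
    actions = workflow.add_subparsers(dest="action", required=True)
    for name in ("inspect", "plan", "run"):
        sub = actions.add_parser(name)
        sub.add_argument("workflow_file", metavar="workflow.md")
        # Hypothetical flag: machine-readable output for agent use.
        sub.add_argument("--format", choices=("text", "json"), default="text")
    return parser


def main(argv=None) -> str:
    """Parse arguments and return the report text (placeholder behavior)."""
    args = build_parser().parse_args(argv)
    report = {"action": args.action, "workflow": args.workflow_file}
    if args.format == "json":
        return json.dumps(report, sort_keys=True)
    return f"{args.action}: {args.workflow_file}"
```

Returning the report as a string, with printing left to the caller, keeps the command logic easy to test without capturing standard output.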

P11.8 - Add representative end-to-end examples

id: MKTT-WP-0011-T008
status: todo
priority: high
state_hub_task_id: "f8501ea6-1ead-477d-8f64-c196e7edfe68"

Create examples covering:

  • multiple ADRs -> release notes
  • contract data -> generated documents
  • source snippets -> docs
  • deterministic summary -> optional assisted review -> final Markdown

Exit Criteria

  • A non-programmer can write a Markdown/YAML workflow that extracts data from Markdown documents and generates new Markdown outputs.
  • Running the same workflow on identical inputs produces identical outputs.
  • Assisted steps are optional, and provider integrations stay outside the workflow engine.
  • Diagnostics identify which source, step, or output failed.
  • The implementation remains compatible with future references/processors, cache/provenance, context engines, and access-control policy.