markitect-tool/workplans/MKTT-WP-0011-markdown-dataflow-pipeline-workflows.md

---
id: MKTT-WP-0011
type: workplan
title: "Markdown Dataflow Pipeline Workflows"
domain: markitect
status: todo
owner: markitect-tool
topic_slug: markitect
planning_priority: P2
planning_order: 75
depends_on_workplans:
  - MKTT-WP-0003
depends_on_tasks:
  - MKTT-WP-0010-T001
  - MKTT-WP-0010-T005
related_workplans:
  - MKTT-WP-0005
  - MKTT-WP-0006
  - MKTT-WP-0008
  - MKTT-WP-0009
created: "2026-05-04"
updated: "2026-05-04"
state_hub_workstream_id: "ed4c491d-4f81-4df0-af51-5f4bd4d1ad91"
---

# MKTT-WP-0011: Markdown Dataflow Pipeline Workflows

## Purpose

Create a declarative workflow layer for Markdown-to-Markdown dataflow:
collecting data from one or more Markdown sources, applying deterministic and
optional assisted processing, and injecting the results into one or more
Markdown outputs.

## Background

The current toolkit has strong primitives: parse, query, extract, transform,
compose, include, template render, contract stub generation, generation plans,
and a provider-neutral assisted-generation hook.

What is missing is orchestration. Users can script these pieces by hand, but
the toolkit has no first-class workflow model yet for:

```text
Markdown sources -> extracted data products -> processors -> generated outputs
```

See `docs/markdown-dataflow-workflow-assessment.md`.

## P11.1 - Define workflow plan model

```task
id: MKTT-WP-0011-T001
status: todo
priority: high
state_hub_task_id: "c335cbaa-dfb9-4df5-b1ae-87aaf6097bd8"
```

Define a Markdown/YAML workflow plan format with sources, named data products,
steps, outputs, variables, dry-run behavior, diagnostics, and provenance.

Output: workflow schema, examples, and validation diagnostics.
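
One way to picture the plan model and its validation diagnostics is the minimal Python sketch below. It is not the schema this task will produce: the class names `Step` and `WorkflowPlan`, every field name, and the `validate` helper are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    # Hypothetical shape: a named step with a kind and the names it consumes.
    name: str
    kind: str
    needs: list[str] = field(default_factory=list)


@dataclass
class WorkflowPlan:
    # All field names here are assumptions, not the final schema.
    sources: dict[str, str]            # source name -> path or glob
    steps: list[Step]
    outputs: dict[str, str]            # output path -> producing step name
    variables: dict[str, str] = field(default_factory=dict)
    dry_run: bool = False


def validate(plan: WorkflowPlan) -> list[str]:
    """Return diagnostics; an empty list means the plan is consistent."""
    diagnostics = []
    seen_steps = set()
    for step in plan.steps:
        if step.name in seen_steps or step.name in plan.sources:
            diagnostics.append(f"step '{step.name}': duplicate name")
        seen_steps.add(step.name)
    known = set(plan.sources) | seen_steps
    for step in plan.steps:
        for need in step.needs:
            if need not in known:
                diagnostics.append(f"step '{step.name}': unknown input '{need}'")
    for path, producer in plan.outputs.items():
        if producer not in seen_steps:
            diagnostics.append(f"output '{path}': unknown step '{producer}'")
    return diagnostics
```

Returning a list of messages rather than raising on the first problem lets a single validation pass report every broken reference at once.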

## P11.2 - Implement Markdown source collectors

```task
id: MKTT-WP-0011-T002
status: todo
priority: high
state_hub_task_id: "16a89801-d96d-437f-a883-81d09586f47a"
```

Collect source data from files, globs, directories, frontmatter paths,
selectors, sections, blocks, metrics, and future reference/index backends.

Output: source collector API, selector integration, and tests.
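
As a rough illustration of glob plus section collection, the sketch below gathers the body of one named section from every matching file. The function names `collect_sections` and `extract_section` are hypothetical, and the heading detection is a deliberately naive regex that does not skip `#` lines inside fenced code blocks; a real collector would presumably reuse the toolkit's existing parse and selector primitives.

```python
from __future__ import annotations

import re
from pathlib import Path

# ATX headings only: one to six '#' characters followed by the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def extract_section(text: str, heading: str) -> str | None:
    """Return the lines under `heading`, up to the next heading of the
    same or a higher level. Returns None when the heading is absent."""
    collected, level = [], None
    for line in text.splitlines():
        match = HEADING.match(line)
        if level is None:
            if match and match.group(2) == heading:
                level = len(match.group(1))
        elif match and len(match.group(1)) <= level:
            break
        else:
            collected.append(line)
    if level is None:
        return None
    return "\n".join(collected).strip()


def collect_sections(root: Path, pattern: str, heading: str) -> dict[str, str]:
    """Map each matching file (path relative to root) to its section body."""
    results = {}
    for path in sorted(root.glob(pattern)):  # sorted for a stable order
        body = extract_section(path.read_text(encoding="utf-8"), heading)
        if body is not None:
            results[path.relative_to(root).as_posix()] = body
    return results
```

Files that lack the requested section are skipped silently here; whether that should instead produce a diagnostic is a design decision for this task.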

## P11.3 - Implement deterministic step registry

```task
id: MKTT-WP-0011-T003
status: todo
priority: high
state_hub_task_id: "808bed93-c7e2-4b34-90f4-f6f961fef503"
```

Create step types for query/extract, transform, compose, include, template
render, contract stub generation, contract checks, and data shaping.

Output: deterministic workflow runner with dependency ordering.
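
Dependency ordering can be sketched with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). Everything else below is an assumption for illustration: the `STEP_TYPES` registry, the `step_type` decorator, the two toy step types, and the dictionary shape passed to `run` are not part of the toolkit.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

STEP_TYPES = {}  # hypothetical registry: step type name -> callable


def step_type(name):
    """Register a function as the implementation of a step type."""
    def register(func):
        STEP_TYPES[name] = func
        return func
    return register


@step_type("upper")
def upper(inputs, params):
    return "\n".join(inputs).upper()


@step_type("join")
def join(inputs, params):
    return params.get("sep", "\n").join(inputs)


def run(steps, sources):
    """Run steps in dependency order and return all named results.

    `steps` maps a step name to {"type": ..., "needs": [...], "params": {...}}.
    `sources` maps a source name to already collected text.
    """
    # Sources are already available, so only step-to-step edges are ordered.
    graph = {
        name: [need for need in spec.get("needs", []) if need not in sources]
        for name, spec in steps.items()
    }
    results = dict(sources)
    # static_order() raises graphlib.CycleError if steps depend on each other.
    for name in TopologicalSorter(graph).static_order():
        if name not in steps:
            raise ValueError(f"unknown input '{name}'")
        spec = steps[name]
        inputs = [results[need] for need in spec.get("needs", [])]
        results[name] = STEP_TYPES[spec["type"]](inputs, spec.get("params", {}))
    return results
```

Because each step is a pure function of its named inputs, the declaration order of steps does not affect the results, only the dependency edges do.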

## P11.4 - Implement data expression and binding model

```task
id: MKTT-WP-0011-T004
status: todo
priority: high
state_hub_task_id: "ea1ad9d2-3668-4b65-afb4-f490e5bfd0c6"
```

Allow workflow steps and outputs to reference previous results by stable names,
for example `${sources.adrs.decisions}` or `${steps.summary.markdown}`.

Output: expression resolver, type checks, and missing-reference diagnostics.
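
A minimal sketch of how `${...}` references might resolve against a nested context is shown below. It assumes dotted names walk plain dictionaries; the names `resolve`, `lookup`, and `MissingReference` are hypothetical, and the real expression grammar may well be richer.

```python
import re

# A reference is ${dotted.name}; the accepted characters are an assumption.
REFERENCE = re.compile(r"\$\{([A-Za-z_][\w.-]*)\}")


class MissingReference(LookupError):
    """Raised when a reference names something the context does not hold."""


def lookup(path: str, context: dict) -> object:
    """Walk `context` one dotted segment at a time."""
    current = context
    parts = path.split(".")
    for index, part in enumerate(parts):
        if not isinstance(current, dict) or part not in current:
            parent = ".".join(parts[:index]) or "<root>"
            raise MissingReference(
                f"${{{path}}}: '{part}' not found under '{parent}'"
            )
        current = current[part]
    return current


def resolve(template: str, context: dict) -> object:
    """Resolve references in `template`.

    A template that is exactly one reference keeps the referenced value's
    type (for example a list). Otherwise each reference must be a scalar
    and is interpolated into the surrounding text.
    """
    whole = REFERENCE.fullmatch(template)
    if whole:
        return lookup(whole.group(1), context)

    def replace(match):
        value = lookup(match.group(1), context)
        if not isinstance(value, (str, int, float)):
            raise TypeError(
                f"${{{match.group(1)}}} is {type(value).__name__}, "
                "which cannot be embedded in text"
            )
        return str(value)

    return REFERENCE.sub(replace, template)
```

With a context holding `sources.adrs.decisions` as a list, `resolve("${sources.adrs.decisions}", ctx)` returns that list unchanged, while embedding the same reference inside a longer string raises `TypeError`, which is one possible form of the type check named in the output above.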

## P11.5 - Add optional assisted processing step boundary

```task
id: MKTT-WP-0011-T005
status: todo
priority: medium
state_hub_task_id: "ed1adc60-fdd8-4d4c-b4d7-7ce906e641c6"
```

Add assisted step support through the provider-neutral generation hook protocol.
The workflow engine must not require provider dependencies and must support
dry-run, optional steps, and policy gates before sending data to a provider.

Output: hook adapter interface and tests with fake providers.
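
The boundary could look roughly like the sketch below. The actual hook protocol is defined elsewhere in the toolkit and is not reproduced here: `GenerationHook`, its single `generate` method, `FakeProvider`, `run_assisted_step`, and the status strings are all stand-ins invented for this example.

```python
from typing import Callable, Optional, Protocol, Tuple


class GenerationHook(Protocol):
    """Hypothetical stand-in for the provider-neutral hook protocol."""

    def generate(self, prompt: str) -> str: ...


class FakeProvider:
    """Test double: records prompts and returns a canned response."""

    def __init__(self):
        self.prompts = []

    def generate(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return f"[reviewed] {prompt}"


def run_assisted_step(
    text: str,
    hook: Optional[GenerationHook],
    *,
    policy: Callable[[str], bool],
    dry_run: bool = False,
) -> Tuple[str, str]:
    """Return (text, status). The input passes through unchanged unless a
    hook is configured, the policy gate allows it, and this is a real run."""
    if hook is None:
        return text, "skipped: no hook configured"
    if not policy(text):
        # The gate runs before anything is sent to a provider.
        return text, "blocked: policy gate"
    if dry_run:
        return text, "dry-run: would call hook"
    return hook.generate(text), "ok"
```

The engine side depends only on the protocol, so no provider package needs to be importable for workflows that skip or dry-run their assisted steps.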

## P11.6 - Implement multi-output sinks

```task
id: MKTT-WP-0011-T006
status: todo
priority: high
state_hub_task_id: "902707d7-46fe-45d6-a9ec-b85763065ff9"
```

Support writing one or many Markdown outputs from templates, generated content,
or composed results. Outputs must be path-safe, reproducible, and traceable to
their source data.

Output: output sink API, path-safety checks, and provenance manifests.
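
A path-safety check and a provenance record can be sketched as follows. The helper names `safe_output_path` and `write_output` and the manifest layout are assumptions; the sketch only illustrates rejecting paths that escape the output root and hashing content so an output can be traced to its inputs.

```python
import hashlib
from pathlib import Path


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def safe_output_path(root: Path, relative: str) -> Path:
    """Resolve `relative` under `root`; reject anything that escapes it,
    including absolute paths and '..' traversal."""
    root = root.resolve()
    target = (root / relative).resolve()
    if root not in target.parents:
        raise ValueError(f"output '{relative}' escapes the output root")
    return target


def write_output(root: Path, relative: str, content: str, sources: dict) -> dict:
    """Write one Markdown output and return a provenance manifest entry.

    `sources` maps a source name to the text that fed this output
    (a hypothetical shape chosen for this sketch).
    """
    target = safe_output_path(root, relative)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write bytes so the file on disk matches the hashed content exactly.
    target.write_bytes(content.encode("utf-8"))
    return {
        "output": relative,
        "sha256": sha256(content),
        "sources": {name: sha256(text) for name, text in sorted(sources.items())},
    }
```

Comparing manifests from two runs is then a cheap way to confirm that identical inputs produced identical outputs.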

## P11.7 - Add workflow CLI

```task
id: MKTT-WP-0011-T007
status: todo
priority: high
state_hub_task_id: "ccc26867-5724-4205-b3fe-a8b9d046775d"
```

Add the following subcommands:

```text
mkt workflow inspect <workflow.md>
mkt workflow plan <workflow.md>
mkt workflow run <workflow.md>
```

Each subcommand should offer JSON and YAML output for agent use.
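
The command surface above could be wired up with `argparse` roughly as sketched here. The `--format` flag, its choices, and the placeholder report are assumptions rather than a decided interface, and YAML output is left out because it would need a third-party library.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="mkt")
    commands = parser.add_subparsers(dest="command", required=True)
    workflow = commands.add_parser("workflow")
    actions = workflow.add_subparsers(dest="action", required=True)
    for action in ("inspect", "plan", "run"):
        sub = actions.add_parser(action)
        sub.add_argument("workflow_file")
        # Hypothetical flag: the real option name is not specified yet.
        sub.add_argument("--format", choices=("text", "json"), default="text")
    return parser


def main(argv) -> str:
    """Parse `argv` and return the rendered report as a string."""
    args = build_parser().parse_args(argv)
    # Placeholder report: a real command would call the workflow engine.
    report = {
        "action": args.action,
        "workflow": args.workflow_file,
        "status": "not implemented",
    }
    if args.format == "json":
        return json.dumps(report, sort_keys=True)
    return f"{args.action} {args.workflow_file}: not implemented"
```

Returning the rendered string instead of printing inside `main` keeps the sketch easy to test; a real entry point would print the result and set an exit code.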

## P11.8 - Add representative end-to-end examples

```task
id: MKTT-WP-0011-T008
status: todo
priority: high
state_hub_task_id: "f8501ea6-1ead-477d-8f64-c196e7edfe68"
```

Create examples covering:

- multiple ADRs -> release notes
- contract data -> generated documents
- source snippets -> docs
- deterministic summary -> optional assisted review -> final Markdown

## Exit Criteria

- A non-programmer can write a Markdown/YAML workflow that extracts data from
  Markdown documents and generates new Markdown outputs.
- Running the same workflow on identical inputs produces identical outputs.
- Assisted steps are optional, and the workflow engine runs without any
  provider dependency installed.
- Diagnostics identify which source, step, or output failed.
- The implementation remains compatible with future references/processors,
  cache/provenance, context engines, and access-control policy.