Covers UC-001 through UC-025: project definition, inspect, validate, build (LEVEL1 + LEVEL3), import, drift detection, full round-trip workflows, test harness, evidence/audit trail, template management, Word-first round-trip, REST service, MCP tools, version/health. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
markidocx Tutorial
Overview
markidocx is a Markdown ↔ DOCX round-trip editing system. Markdown is the canonical source of truth; Word documents are editorial projections used for review. Every operation preserves this asymmetry: edits made in Word are imported back into Markdown, and the DOCX never becomes the source of record.
All capabilities are available through three equivalent interfaces:
- CLI: local document workflows
- REST: pipeline and automation integration (markidocx serve)
- MCP: agent-accessible tools (markidocx mcp)
1. Define a Project (UC-001)
Everything in markidocx starts with a manifest file — a YAML declaration of your sources, feature level, and document family.
# manifest.yaml
project:
  name: "Technical Specification"
  feature_level: level1   # or level3 for advanced features
  family: article         # article | book | website
sources:
  - path: intro.md
  - path: chapters/design.md
  - path: chapters/api.md
output:
  dir: ./dist
Feature levels:
- level1: headings, lists, tables, footnotes, images, links
- level3: everything in LEVEL1, plus cross-references, numbered figures, auto-diagrams (Mermaid/Graphviz/PlantUML), bibliography
Built-in families:
| Family | Description |
|---|---|
| article | Single-document article layout |
| book | Multi-chapter book layout |
| website | Web-optimised document layout |
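To make the manifest fields concrete, here is a minimal Python sketch that checks a manifest already loaded into a dict against the feature levels and built-in families listed above. The function name `check_manifest` and its error strings are invented for this example. It is not the logic behind `markidocx validate`, and it knows nothing about custom registered families.

```python
# Hypothetical pre-flight check; NOT the implementation of `markidocx validate`.
FEATURE_LEVELS = {"level1", "level3"}
BUILTIN_FAMILIES = {"article", "book", "website"}


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable problems (empty when none are found)."""
    problems = []
    project = manifest.get("project") or {}
    if not project.get("name"):
        problems.append("project.name is missing")
    if project.get("feature_level") not in FEATURE_LEVELS:
        problems.append("project.feature_level must be level1 or level3")
    if project.get("family") not in BUILTIN_FAMILIES:
        problems.append("project.family is not a built-in family")
    sources = manifest.get("sources") or []
    if not sources:
        problems.append("sources is empty")
    for index, entry in enumerate(sources):
        if not isinstance(entry, dict) or not entry.get("path"):
            problems.append(f"sources[{index}] has no path")
    return problems


manifest = {
    "project": {"name": "Technical Specification",
                "feature_level": "level1", "family": "article"},
    "sources": [{"path": "intro.md"}, {"path": "chapters/design.md"}],
    "output": {"dir": "./dist"},
}
print(check_manifest(manifest))  # prints []
```

A check like this can catch typos in an editor hook or script; the authoritative answer still comes from the CLI.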
2. Validate and Inspect (UC-002, UC-003)
Before building, confirm the project is well-formed.
Validate — checks manifest structure, source file existence, family/level compatibility:
markidocx validate manifest.yaml
# ✓ Manifest valid: Technical Specification
markidocx validate manifest.yaml --json
# {"status": "ok", "project": "Technical Specification"}
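The `--json` form is the one a script would consume. The sketch below parses the sample output shown above. The helper names are illustrative, and the assumption that a failed validation also prints JSON with a different `status` value is not confirmed by this tutorial.

```python
import json
import subprocess


def parse_validate_output(stdout: str) -> tuple[bool, str]:
    """Interpret the JSON printed by `markidocx validate ... --json`."""
    report = json.loads(stdout)
    # Assumption: any status other than "ok" means the manifest is invalid.
    return report.get("status") == "ok", report.get("project", "")


def validate(manifest_path: str) -> tuple[bool, str]:
    """Run the CLI (it must be installed and on PATH) and parse its output."""
    completed = subprocess.run(
        ["markidocx", "validate", manifest_path, "--json"],
        capture_output=True, text=True, check=False,
    )
    return parse_validate_output(completed.stdout)


sample = '{"status": "ok", "project": "Technical Specification"}'
print(parse_validate_output(sample))  # prints (True, 'Technical Specification')
```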
Inspect — shows the full resolved structure including LEVEL3 capability availability:
markidocx inspect manifest.yaml
# Project: Technical Specification
# family: article
# feature_level: level1
# sources: intro.md, chapters/design.md, chapters/api.md
# level3 xref: False
# level3 diag: False
markidocx inspect manifest.yaml --json
# {
#   "status": "ok",
#   "project": "Technical Specification",
#   "family": "article",
#   "feature_level": "level1",
#   "sources": ["intro.md", "chapters/design.md", "chapters/api.md"],
#   "level3": {"xref_available": false, "diagrams_available": false, ...}
# }
The level3 block tells you which optional processors (mmdc, dot, plantuml) are available on your PATH.
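To see roughly what such a PATH probe involves, the sketch below looks up the three executables named above with Python's `shutil.which`. The diagram-type labels and the function name are this example's own; how markidocx itself performs the check is not described here.

```python
import shutil

# Executable names come from the text above; the keys are illustrative labels.
RENDERERS = {"mermaid": "mmdc", "graphviz": "dot", "plantuml": "plantuml"}


def available_renderers(which=shutil.which) -> dict[str, bool]:
    """Map each diagram type to whether its renderer is found on PATH."""
    return {kind: which(exe) is not None for kind, exe in RENDERERS.items()}


for kind, found in available_renderers().items():
    print(f"{kind}: {'found' if found else 'missing'}")
```

The `which` parameter exists only so the lookup can be replaced with a stub in tests.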
3. Build a DOCX (UC-004, UC-005, UC-014, UC-015)
Compile Markdown sources into a Word document:
markidocx build manifest.yaml
# ✓ Built: dist/technical-specification.docx
markidocx build manifest.yaml --json
# {"status": "ok", "output_path": "dist/...", "family": "article", "warnings": []}
Switching families (UC-005) — change family: in the manifest to re-build with different styling. All three built-in families are always available without any setup.
LEVEL3 document (UC-015) — set feature_level: level3 and include advanced constructs in your Markdown:
<!-- Cross-reference -->
See [Section 2][sec-design].
<!-- Numbered figure -->

*Figure 1: System architecture*
<!-- figure-label: fig-arch -->
<!-- Auto-diagram (requires mmdc on PATH) -->
```mermaid
graph TD
    A[Client] --> B[API]
    B --> C[Database]
```
<!-- Bibliography citation -->
As noted in [@smith2023], the approach is sound.
References
- [@smith2023]: Smith, J. Technical Approaches. 2023.
If a diagram renderer is unavailable, markidocx falls back to embedding the source as a verbatim code block and emits a warning: **source is never silently discarded**.
## 4. Import an Edited DOCX (UC-006)
After a reviewer edits the Word document, import their changes back to Markdown:
```bash
markidocx import manifest.yaml dist/technical-specification-reviewed.docx
# ✓ Imported (mapped)
#   → intro.md
#   → chapters/design.md
#   → chapters/api.md
```
For single-file projects the import produces one .md file. For multi-file projects, markidocx redistributes content back to the original source files using heading boundaries as guides. If redistribution is ambiguous, it falls back to a single merged file and reports mapping_status: fallback.
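One way to picture redistribution by heading boundaries: treat the first heading of each original source file as the marker where that file's content begins in the merged import. The sketch below is purely illustrative and is not markidocx's algorithm. The fallback file name `merged.md` is invented, and fenced code containing `#` lines is not handled.

```python
from typing import Optional


def first_heading(markdown: str) -> Optional[str]:
    """Return the first line that starts with '#', or None if there is none."""
    for line in markdown.splitlines():
        if line.startswith("#"):
            return line.strip()
    return None


def redistribute(merged: str, originals: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Split `merged` into the files of `originals` at heading boundaries.

    Returns (mapping_status, files). Falls back to a single merged file when
    a boundary heading is missing, repeated, or out of order.
    """
    fallback = ("fallback", {"merged.md": merged})  # hypothetical file name
    lines = merged.splitlines()
    stripped = [line.strip() for line in lines]
    boundaries = []
    for path, text in originals.items():
        heading = first_heading(text)
        if heading is None or stripped.count(heading) != 1:
            return fallback
        boundaries.append((stripped.index(heading), path))
    starts = [start for start, _ in boundaries]
    if not starts or starts != sorted(starts) or len(set(starts)) != len(starts):
        return fallback
    starts[0] = 0  # anything before the first heading stays with the first file
    ends = starts[1:] + [len(lines)]
    files = {}
    for (_, path), start, end in zip(boundaries, starts, ends):
        files[path] = "\n".join(lines[start:end]).strip() + "\n"
    return "mapped", files


originals = {
    "intro.md": "# Introduction\n\nOld text.\n",
    "chapters/design.md": "# Design\n\nOld design.\n",
}
merged = "# Introduction\n\nNew text.\n\n# Design\n\nNew design.\n"
status, files = redistribute(merged, originals)
print(status)  # prints mapped
```

If a reviewer renames or deletes a boundary heading, this sketch reports `fallback`, which mirrors the ambiguous case described above.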
5. Detect Round-Trip Drift (UC-011)
After importing, check whether any structure was lost or degraded:
markidocx compare manifest.yaml dist/technical-specification-reviewed.docx
# ✓ No drift detected
# preserved: 12 elements
# Or with drift:
# ⚠ Drift detected
# degraded: heading:## Background (1/2)
# broken: footnote:[^1]
The drift report classifies every structural element as:
- preserved — identical in original and re-import
- degraded — present but modified
- broken — present in original, missing from re-import
- unsupported — construct not supported at the declared feature level
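The four categories can be expressed as a small decision procedure. The following sketch assumes each structural element is identified by a key such as `heading:## Background` (the form used in the report above) and carries some normalised content to compare. The function and its data shapes are hypothetical, not markidocx's comparison engine.

```python
# Element kinds available at level1, per the feature-level list earlier on.
LEVEL1_KINDS = {"heading", "list", "table", "footnote", "image", "link"}


def classify_drift(original: dict[str, str], reimported: dict[str, str],
                   supported_kinds: set[str]) -> dict[str, str]:
    """Classify every element of `original` against `reimported`.

    The part of each key before the first colon is treated as the kind.
    """
    report = {}
    for key, content in original.items():
        kind = key.split(":", 1)[0]
        if kind not in supported_kinds:
            report[key] = "unsupported"
        elif key not in reimported:
            report[key] = "broken"
        elif reimported[key] != content:
            report[key] = "degraded"
        else:
            report[key] = "preserved"
    return report


original = {
    "heading:## Background": "## Background",
    "footnote:[^1]": "[^1]: See appendix.",
    "table:results": "| a | b |",
    "xref:sec-design": "[Section 2][sec-design]",
}
reimported = {
    "heading:## Background": "### Background",  # level changed in Word
    "table:results": "| a | b |",
}
for key, verdict in classify_drift(original, reimported, LEVEL1_KINDS).items():
    print(f"{verdict}: {key}")
```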
6. Full Round-Trip Workflow (UC-007)
The workflow command runs the full cycle in one step:
markidocx workflow single-file-roundtrip manifest.yaml
# ✓ Workflow single-file-roundtrip: pass
# ✓ validate: executed
# ✓ build: executed
# ✓ import: executed
# ✓ compare: executed
# run_id: a3f91c2e-...
# Multi-file variant
markidocx workflow multi-file-roundtrip manifest.yaml
Available workflows:
| Workflow | Steps |
|---|---|
| single-file-roundtrip | validate → build → import → compare |
| multi-file-roundtrip | validate → build → import → redistribute → compare |
| release-regression | full regression against the stable corpus |
| family-switch-build | build under each of the three built-in families |
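A workflow is essentially an ordered list of steps. The sketch below models the two round-trip workflows from the table as step lists and runs caller-supplied callables in order. The stop-at-first-failure behaviour and the `failed` and `fail` labels are assumptions made for this example; only `pass`, `executed` and `skipped` appear in the tool output shown in this tutorial.

```python
from typing import Callable

# Step lists copied from the table above.
WORKFLOWS = {
    "single-file-roundtrip": ["validate", "build", "import", "compare"],
    "multi-file-roundtrip": ["validate", "build", "import", "redistribute", "compare"],
}


def run_workflow(name: str, steps: dict[str, Callable[[], bool]]) -> dict:
    """Run each step in order; after the first failure, mark the rest skipped."""
    results = {}
    failed = False
    for step in WORKFLOWS[name]:
        if failed:
            results[step] = "skipped"
        elif steps[step]():
            results[step] = "executed"
        else:
            results[step] = "failed"
            failed = True
    return {"workflow": name, "status": "fail" if failed else "pass", "steps": results}


# Stub steps stand in for real work so the sketch runs on its own.
stubs = {step: (lambda: True) for step in WORKFLOWS["single-file-roundtrip"]}
print(run_workflow("single-file-roundtrip", stubs)["status"])  # prints pass
```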
7. Run the Test Suite (UC-021)
Run the end-to-end regression harness for a project:
markidocx test manifest.yaml
# ✓ Tests: 4 passed, 0 failed, 0 skipped
# ✓ validate: executed
# ✓ build: executed
# ✓ import: executed
# ✓ compare: executed
# run_id: b7d04a1e-...
# Exit code 0 on pass, 1 on any failure — CI-friendly
markidocx test manifest.yaml --json
8. Evidence and Audit Trail (UC-025, UC-022)
Every build, import, compare, and workflow run produces a persistent evidence record keyed by run_id.
List recent runs:
markidocx evidence list
markidocx evidence list --limit 5 --json
Retrieve a run's evidence:
markidocx evidence get a3f91c2e-...
# ✓ Run: a3f91c2e-... [pass]
# Reports: 4
# Warnings: 0
# Errors: 0
# • build (a3f91c2e-…)
# • import (a3f91c2e-…)
# • compare (a3f91c2e-…)
# • validation (a3f91c2e-…)
markidocx evidence get a3f91c2e-... --json
markidocx evidence get a3f91c2e-... --output evidence.json
The assembled EvidenceSet reports:
- classification: pass | pass-with-warnings | failed
- components: which report types are present
- completeness_note: set if expected reports are absent for the workflow type
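How the three fields relate can be sketched in a few lines. The example below assumes each report is a dict with a `type` plus optional `warnings` and `errors` lists, and that a single-file round-trip is expected to produce the four report types shown in the listing above. Both the report shape and the expected-reports table are hypothetical.

```python
# Assumed mapping from workflow type to the reports it should produce.
EXPECTED_REPORTS = {
    "single-file-roundtrip": {"validation", "build", "import", "compare"},
}


def assemble_evidence(workflow: str, reports: list[dict]) -> dict:
    """Summarise per-step reports into an EvidenceSet-like dict."""
    errors = sum(len(report.get("errors", [])) for report in reports)
    warnings = sum(len(report.get("warnings", [])) for report in reports)
    if errors:
        classification = "failed"
    elif warnings:
        classification = "pass-with-warnings"
    else:
        classification = "pass"
    components = sorted(report["type"] for report in reports)
    missing = sorted(EXPECTED_REPORTS.get(workflow, set()) - set(components))
    note = f"missing reports: {', '.join(missing)}" if missing else None
    return {"classification": classification, "components": components,
            "completeness_note": note}


reports = [
    {"type": "validation"},
    {"type": "build", "warnings": ["diagram renderer unavailable"]},
]
print(assemble_evidence("single-file-roundtrip", reports))
```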
9. Template Management (UC-012, UC-013)
List families:
markidocx template list
markidocx template list --json
List styles in a family — inspect the actual Word styles available:
markidocx template styles
markidocx template styles --family book
markidocx template styles --family article --json
# [
# {"name": "Heading 1", "style_id": "Heading1", "type": "paragraph", ...},
# {"name": "Normal", "style_id": "Normal", "type": "paragraph", ...},
# ...
# ]
Register a custom template:
markidocx template register my-brand.docx --name brand --description "Corporate brand"
Extract a template from an existing Word document:
markidocx template extract existing-report.docx
# ✓ Template extracted: existing-report-template.docx
# Styles preserved: 42
markidocx template extract existing-report.docx \
    --template-out corporate-template.docx \
    --family corporate
This strips all body content while preserving every style, page setup, header, footer, and theme from the source document. The result is a content-free template ready for use with markidocx build.
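For readers curious what stripping the body means at the file level: a .docx is a ZIP archive, and the main text lives in `word/document.xml`. The sketch below, using only the Python standard library, copies every other part unchanged and empties the `w:body` element except for its `w:sectPr` child, which holds page setup and header/footer references. It illustrates the idea only and is not how `markidocx template extract` is implemented. Real documents declare many more XML namespaces than this toy handles, and ElementTree may rewrite their prefixes.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def strip_body(docx_bytes: bytes) -> bytes:
    """Return a copy of a .docx whose body keeps only the section properties."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                root = ET.fromstring(data)
                body = root.find(f"{{{W_NS}}}body")
                if body is not None:
                    for child in list(body):
                        if child.tag != f"{{{W_NS}}}sectPr":
                            body.remove(child)
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            dst.writestr(item, data)  # styles, theme, headers etc. copied as-is
    return out.getvalue()


# Tiny stand-in for a real .docx, just enough to exercise the function.
document = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r></w:p><w:sectPr/>"
    "</w:body></w:document>"
).encode("utf-8")
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as archive:
    archive.writestr("word/document.xml", document)
    archive.writestr("word/styles.xml", b"<styles/>")
template = strip_body(source.getvalue())
```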
10. Word-First Round-Trip (UC-006 variant)
If you have an existing Word document and want to bring it into the markidocx workflow:
# Step 1: extract the template shell
markidocx template extract report.docx --template-out report-template.docx
# Step 2: import the content to Markdown
markidocx import manifest.yaml report.docx
# → content.md
# Step 3: edit content.md in your editor, then rebuild
markidocx build manifest.yaml
# ✓ Built: dist/report.docx
# Step 4: verify zero structural drift from the original
markidocx compare manifest.yaml dist/report.docx
# ✓ No drift detected
11. REST Service (UC-019)
Start the service:
markidocx serve # production
markidocx serve --dev --port 8080 # dev mode with auto-reload
All CLI operations have REST equivalents:
| CLI | REST |
|---|---|
| validate | POST /validate |
| build | POST /build |
| import | POST /import |
| compare | POST /compare |
| workflow | POST /workflows/{name} |
| evidence get | GET /evidence/{run_id} |
| template list | GET /templates |
| template styles | GET /styles?family=article |
| template extract | POST /template/extract |
Example — build via REST:
curl -X POST http://localhost:8000/build \
  -H "Content-Type: application/json" \
  -d '{
    "manifest_yaml": "project:\n  name: Test\n  feature_level: level1\n  family: article\nsources:\n  - path: doc.md\noutput:\n  dir: ./dist\n",
    "sources": [{"name": "doc.md", "content": "# Hello\n\nWorld.\n"}]
  }'
# {"status": "ok", "outputs": {"docx_base64": "..."}, "warnings": []}
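The same request can be issued from a script. This Python sketch builds the JSON body in the shape shown in the curl example and decodes `outputs.docx_base64` from the reply. The helper names are invented, and the error handling assumes that a failed build reports a `status` other than `ok`, which this tutorial does not confirm.

```python
import base64
import json
import urllib.request


def build_payload(manifest_yaml: str, sources: dict[str, str]) -> bytes:
    """Encode the request body for POST /build."""
    body = {
        "manifest_yaml": manifest_yaml,
        "sources": [{"name": name, "content": content}
                    for name, content in sources.items()],
    }
    return json.dumps(body).encode("utf-8")


def extract_docx(response_body: bytes) -> bytes:
    """Decode outputs.docx_base64 from a /build response into DOCX bytes."""
    reply = json.loads(response_body)
    if reply.get("status") != "ok":  # assumption about the failure shape
        raise RuntimeError(f"build failed: {reply}")
    return base64.b64decode(reply["outputs"]["docx_base64"])


def build_via_rest(base_url: str, manifest_yaml: str, sources: dict[str, str]) -> bytes:
    """POST to a running `markidocx serve` instance and return the DOCX bytes."""
    request = urllib.request.Request(
        base_url + "/build",
        data=build_payload(manifest_yaml, sources),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_docx(response.read())
```

With the service running, `build_via_rest("http://localhost:8000", manifest_text, {"doc.md": "# Hello\n"})` would return bytes that can be written straight to a `.docx` file.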
Capability and health discovery:
curl http://localhost:8000/capabilities
curl http://localhost:8000/health
curl http://localhost:8000/version
12. MCP Tools (UC-020)
Start the MCP server:
markidocx mcp
Available tools (callable by any MCP-compatible agent):
| Tool | Description |
|---|---|
| validate_project(manifest_yaml) | Validate a manifest |
| inspect_project(manifest_yaml) | Inspect project structure + capabilities |
| build(manifest_yaml, sources) | Build DOCX, returns docx_base64 |
| import_docx(manifest_yaml, docx_base64) | Import DOCX to Markdown |
| compare(manifest_yaml, docx_base64, sources) | Drift detection |
| run_tests(manifest_yaml, sources) | End-to-end regression |
| invoke_workflow(name, manifest_yaml, sources) | Named workflow |
| get_evidence(run_id) | Retrieve evidence set |
| list_templates() | Available families |
| list_styles(family) | Styles in a family |
| extract_template(source_path, template_out) | Extract template shell |
| get_version() | Version info |
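Under the Model Context Protocol, an agent invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only constructs such a message for the tools in the table. In practice an MCP client library also handles the initialisation handshake and the transport, so treat this as an illustration of the message shape and not as a working client.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call(name: str, **arguments) -> str:
    """Serialise an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names follow the tool signatures listed in the table above.
print(tool_call("list_styles", family="book"))
```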
13. Version and Health (UC-024)
markidocx --version
# markidocx 0.1.0
# Via REST
curl http://localhost:8000/health
# {"status": "ok", "version": "0.1.0"}
Summary: The Core Workflow
1. Author writes Markdown → manifest.yaml + *.md files
2. markidocx inspect → confirm structure and capabilities
3. markidocx build → dist/document.docx (send to reviewer)
4. Reviewer edits DOCX → document-reviewed.docx (returned)
5. markidocx import → Markdown updated with reviewer edits
6. markidocx compare → drift report confirms what changed
7. markidocx evidence list → audit trail for every run
All three interfaces (CLI, REST, MCP) expose the same functional model. No capability is interface-specific — every operation accessible via the CLI is equally accessible to a pipeline or an agent.