# markidocx Tutorial

## Overview

markidocx is a **Markdown ↔ DOCX round-trip editing system**. Markdown is the canonical source of truth; Word documents are editorial projections used for review. Every operation preserves this asymmetry: a build projects Markdown into Word, and edits made in Word are imported back into Markdown, which remains the authoritative copy.

All capabilities are available through three equivalent interfaces:

- **CLI**: local document workflows
- **REST**: pipeline and automation integration (`markidocx serve`)
- **MCP**: agent-accessible tools (`markidocx mcp`)

---

## 1. Define a Project (UC-001)

Everything in markidocx starts with a **manifest file**: a YAML declaration of your sources, feature level, and document family.

```yaml
# manifest.yaml
project:
  name: "Technical Specification"
  feature_level: level1   # or level3 for advanced features
  family: article         # article | book | website

sources:
  - path: intro.md
  - path: chapters/design.md
  - path: chapters/api.md

output:
  dir: ./dist
```

**Feature levels:**

- `level1`: headings, lists, tables, footnotes, images, links
- `level3`: everything in LEVEL1, plus cross-references, numbered figures, auto-diagrams (Mermaid/Graphviz/PlantUML), bibliography

**Built-in families:**

| Family | Description |
|--------|-------------|
| `article` | Single-document article layout |
| `book` | Multi-chapter book layout |
| `website` | Web-optimised document layout |
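
The manifest fields described above can be sanity-checked before handing the file to markidocx. The following is a minimal sketch, not markidocx's own validator: it assumes the manifest has already been loaded into a plain dict by a YAML parser of your choice, and the function name `check_manifest` is hypothetical. It only knows the three built-in families, so a custom registered family would need to be added to the allowed set.

```python
# Hypothetical pre-flight check; the allowed values are taken from the tutorial text.
ALLOWED_LEVELS = {"level1", "level3"}
BUILT_IN_FAMILIES = {"article", "book", "website"}


def check_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means the fields look plausible."""
    problems = []
    project = manifest.get("project") or {}
    if not project.get("name"):
        problems.append("project.name is missing")
    if project.get("feature_level") not in ALLOWED_LEVELS:
        problems.append("project.feature_level must be level1 or level3")
    if project.get("family") not in BUILT_IN_FAMILIES:
        problems.append("project.family is not a built-in family")
    sources = manifest.get("sources") or []
    if not sources:
        problems.append("sources is empty")
    for index, entry in enumerate(sources):
        # Each source entry is expected to look like {"path": "intro.md"}.
        if not isinstance(entry, dict) or not entry.get("path"):
            problems.append(f"sources[{index}] has no path")
    return problems


example = {
    "project": {
        "name": "Technical Specification",
        "feature_level": "level1",
        "family": "article",
    },
    "sources": [{"path": "intro.md"}, {"path": "chapters/design.md"}],
    "output": {"dir": "./dist"},
}
print(check_manifest(example))
```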

---

## 2. Validate and Inspect (UC-002, UC-003)

Before building, confirm the project is well-formed.

**Validate** checks manifest structure, source file existence, and family/level compatibility:

```bash
markidocx validate manifest.yaml
# ✓ Manifest valid: Technical Specification

markidocx validate manifest.yaml --json
# {"status": "ok", "project": "Technical Specification"}
```

**Inspect** shows the full resolved structure, including LEVEL3 capability availability:

```bash
markidocx inspect manifest.yaml
# Project: Technical Specification
#   family: article
#   feature_level: level1
#   sources: intro.md, chapters/design.md, chapters/api.md
#   level3 xref: False
#   level3 diag: False

markidocx inspect manifest.yaml --json
# {
#   "status": "ok",
#   "project": "Technical Specification",
#   "family": "article",
#   "feature_level": "level1",
#   "sources": ["intro.md", "chapters/design.md", "chapters/api.md"],
#   "level3": {"xref_available": false, "diagrams_available": false, ...}
# }
```

The `level3` block tells you which optional processors (`mmdc`, `dot`, `plantuml`) are available on your PATH.
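
To see what a PATH check of this kind amounts to, here is a small sketch using only the Python standard library. It is an illustration of the idea, not markidocx's detection code; the helper name `find_renderers` is an assumption, while the three executable names come from the text above.

```python
import shutil
from typing import Dict, Optional

# Renderer executables named in the tutorial.
RENDERERS = ("mmdc", "dot", "plantuml")


def find_renderers(names=RENDERERS) -> Dict[str, Optional[str]]:
    """Map each renderer name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_renderers().items():
    print(f"{name}: {path or 'not found'}")
```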

---

## 3. Build a DOCX (UC-004, UC-005, UC-014, UC-015)

Compile Markdown sources into a Word document:

```bash
markidocx build manifest.yaml
# ✓ Built: dist/technical-specification.docx

markidocx build manifest.yaml --json
# {"status": "ok", "output_path": "dist/...", "family": "article", "warnings": []}
```

**Switching families** (UC-005): change `family:` in the manifest to rebuild with different styling. All three built-in families are available without any setup.

**LEVEL3 document** (UC-015): set `feature_level: level3` and include advanced constructs in your Markdown:

````markdown
<!-- Cross-reference -->
See [Section 2][sec-design].

<!-- Numbered figure -->
![System architecture](architecture.png)
*Figure 1: System architecture*
<!-- figure-label: fig-arch -->

<!-- Auto-diagram (requires mmdc on PATH) -->
```mermaid
graph TD
    A[Client] --> B[API]
    B --> C[Database]
```

<!-- Citation -->
As noted in [@smith2023], the approach is sound.

## References

- [@smith2023]: Smith, J. *Technical Approaches*. 2023.
````

If a diagram renderer is unavailable, markidocx falls back to embedding the source as a verbatim code block and emits a warning; **source is never silently discarded**.
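
The fallback rule can be pictured with a short sketch. This is an assumption about the shape of the logic, not markidocx's implementation: the function name `render_or_fallback` and the warning wording are hypothetical, and the actual rendering call is left as a placeholder because the tutorial does not describe how renderers are invoked.

```python
import shutil

# Built programmatically so the example nests cleanly inside Markdown.
FENCE = "`" * 3


def render_or_fallback(source: str, renderer: str, language: str):
    """Return (markdown_fragment, warnings) for one diagram block."""
    if shutil.which(renderer) is None:
        # Renderer missing: keep the source verbatim and say so.
        fragment = f"{FENCE}{language}\n{source.rstrip()}\n{FENCE}"
        warning = f"{renderer} not found on PATH; embedded {language} source verbatim"
        return fragment, [warning]
    # Renderer present: real rendering is out of scope for this sketch.
    return f"<!-- {language} diagram to be rendered by {renderer} -->", []


fragment, warnings = render_or_fallback(
    "graph TD\n    A[Client] --> B[API]\n", "no-such-renderer-xyz", "mermaid"
)
print(fragment)
print(warnings)
```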

---

## 4. Import an Edited DOCX (UC-006)

After a reviewer edits the Word document, import their changes back to Markdown:

```bash
markidocx import manifest.yaml dist/technical-specification-reviewed.docx
# ✓ Imported (mapped)
#   → intro.md
#   → chapters/design.md
#   → chapters/api.md
```

For **single-file projects**, the import produces one `.md` file. For **multi-file projects**, markidocx redistributes content back to the original source files using heading boundaries as guides. If redistribution is ambiguous, it falls back to a single merged file and reports `mapping_status: fallback`.
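
One way to picture heading-based redistribution is the sketch below. It rests on assumptions that the tutorial does not confirm: each source file is taken to open with a heading line that appears exactly once in the merged Markdown, and any other situation is treated as ambiguous. The function name `redistribute` and the fallback file name `merged.md` are hypothetical; only the status values `mapped` and `fallback` come from the text above.

```python
from typing import Dict, Tuple


def redistribute(merged: str, first_headings: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Split merged Markdown back into files at known heading boundaries.

    first_headings maps each source path to the heading line that opens it,
    for example {"intro.md": "# Introduction"}.
    """
    lines = merged.splitlines()
    starts = {}
    for path, heading in first_headings.items():
        hits = [i for i, line in enumerate(lines) if line.strip() == heading]
        if len(hits) != 1:
            # Heading missing or duplicated: the boundary is ambiguous.
            return "fallback", {"merged.md": merged}
        starts[path] = hits[0]
    ordered = sorted(starts.items(), key=lambda item: item[1])
    # Content before the first boundary cannot be assigned to any file.
    if not ordered or any(line.strip() for line in lines[: ordered[0][1]]):
        return "fallback", {"merged.md": merged}
    ends = [start for _, start in ordered[1:]] + [len(lines)]
    files = {}
    for (path, start), end in zip(ordered, ends):
        files[path] = "\n".join(lines[start:end]).strip() + "\n"
    return "mapped", files


merged_text = "# Intro\n\nHello.\n\n# Design\n\nText.\n"
status, files = redistribute(merged_text, {"intro.md": "# Intro", "design.md": "# Design"})
print(status, sorted(files))
```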

---

## 5. Detect Round-Trip Drift (UC-011)

After importing, check whether any structure was lost or degraded:

```bash
markidocx compare manifest.yaml dist/technical-specification-reviewed.docx
# ✓ No drift detected
#   preserved: 12 elements

# Or with drift:
# ⚠ Drift detected
#   degraded: heading:## Background (1/2)
#   broken: footnote:[^1]
```

The drift report classifies every structural element as:

- **preserved**: identical in original and re-import
- **degraded**: present but modified
- **broken**: present in original, missing from re-import
- **unsupported**: construct not supported at the declared feature level
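
The four categories above can be expressed as a simple comparison over keyed elements. The sketch below is illustrative only and makes several assumptions: elements are represented as `kind:identifier` keys mapped to their text (loosely mirroring the `heading:...` and `footnote:...` labels in the sample output), and the set of unsupported kinds is passed in by the caller. The function name `classify_drift` is hypothetical.

```python
from typing import Dict, FrozenSet, List


def classify_drift(
    original: Dict[str, str],
    reimported: Dict[str, str],
    unsupported_kinds: FrozenSet[str] = frozenset(),
) -> Dict[str, List[str]]:
    """Sort every element of the original into one of the four drift classes."""
    report = {"preserved": [], "degraded": [], "broken": [], "unsupported": []}
    for key, content in original.items():
        kind = key.split(":", 1)[0]
        if kind in unsupported_kinds:
            report["unsupported"].append(key)
        elif key not in reimported:
            report["broken"].append(key)      # present in original, missing now
        elif reimported[key] == content:
            report["preserved"].append(key)   # identical on both sides
        else:
            report["degraded"].append(key)    # present but modified
    return report


before = {
    "heading:background": "## Background",
    "footnote:1": "A note.",
    "table:t1": "| a | b |",
    "xref:sec-design": "Section 2",
}
after = {
    "heading:background": "## Background (1/2)",
    "table:t1": "| a | b |",
    "xref:sec-design": "Section 2",
}
drift = classify_drift(before, after, unsupported_kinds=frozenset({"xref"}))
print(drift)
```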

---

## 6. Full Round-Trip Workflow (UC-007)

The `workflow` command runs the full cycle in one step:

```bash
markidocx workflow single-file-roundtrip manifest.yaml
# ✓ Workflow single-file-roundtrip: pass
#   ✓ validate: executed
#   ✓ build: executed
#   ✓ import: executed
#   ✓ compare: executed
#   run_id: a3f91c2e-...

# Multi-file variant
markidocx workflow multi-file-roundtrip manifest.yaml
```

Available workflows:

| Workflow | Steps |
|----------|-------|
| `single-file-roundtrip` | validate → build → import → compare |
| `multi-file-roundtrip` | validate → build → import → redistribute → compare |
| `release-regression` | full regression against the stable corpus |
| `family-switch-build` | build under each of the three built-in families |

---

## 7. Run the Test Suite (UC-021)

Run the end-to-end regression harness for a project:

```bash
markidocx test manifest.yaml
# ✓ Tests: 4 passed, 0 failed, 0 skipped
#   ✓ validate: executed
#   ✓ build: executed
#   ✓ import: executed
#   ✓ compare: executed
#   run_id: b7d04a1e-...

# Exit code 0 on pass, 1 on any failure; CI-friendly
markidocx test manifest.yaml --json
```
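
Because the command signals failure through its exit code, a pipeline step can gate on it without parsing any output. The wrapper below is a hedged sketch: `run_gate` is a hypothetical helper built on the standard `subprocess` module, and the only markidocx invocation it refers to is the `markidocx test manifest.yaml --json` command shown above.

```python
import subprocess
import sys
from typing import List


def run_gate(command: List[str]) -> bool:
    """Run a command and report whether it exited with status 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface whatever the tool printed so the CI log explains the failure.
        sys.stderr.write(completed.stdout + completed.stderr)
    return completed.returncode == 0


# In a CI script (assumes markidocx is installed and on PATH):
#     ok = run_gate(["markidocx", "test", "manifest.yaml", "--json"])
#     sys.exit(0 if ok else 1)
```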

---

## 8. Evidence and Audit Trail (UC-025, UC-022)

Every `build`, `import`, `compare`, and `workflow` run produces a persistent evidence record keyed by `run_id`.

**List recent runs:**

```bash
markidocx evidence list
markidocx evidence list --limit 5 --json
```

**Retrieve a run's evidence:**

```bash
markidocx evidence get a3f91c2e-...
# ✓ Run: a3f91c2e-... [pass]
#   Reports: 4
#   Warnings: 0
#   Errors: 0
#   • build (a3f91c2e-…)
#   • import (a3f91c2e-…)
#   • compare (a3f91c2e-…)
#   • validation (a3f91c2e-…)

markidocx evidence get a3f91c2e-... --json
markidocx evidence get a3f91c2e-... --output evidence.json
```

The assembled **EvidenceSet** reports:

- `classification`: `pass` | `pass-with-warnings` | `failed`
- `components`: which report types are present
- `completeness_note`: set if expected reports are absent for the workflow type
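
A plausible reading of these fields is sketched below. The tutorial does not state the exact rules, so both functions are assumptions: `classify` supposes that any error means `failed` and that warnings alone mean `pass-with-warnings`, and `completeness_note` supposes that the note simply lists the expected report types that are missing. Both names and the note wording are hypothetical.

```python
from typing import Optional, Set


def classify(warnings: int, errors: int) -> str:
    """Derive a classification from warning and error counts (assumed rule)."""
    if errors > 0:
        return "failed"
    if warnings > 0:
        return "pass-with-warnings"
    return "pass"


def completeness_note(expected: Set[str], present: Set[str]) -> Optional[str]:
    """Return a note naming missing report types, or None when nothing is missing."""
    missing = sorted(expected - present)
    if not missing:
        return None
    return "missing reports: " + ", ".join(missing)


roundtrip_reports = {"validation", "build", "import", "compare"}
print(classify(warnings=0, errors=0))
print(completeness_note(roundtrip_reports, {"validation", "build"}))
```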

---

## 9. Template Management (UC-012, UC-013)

**List families:**

```bash
markidocx template list
markidocx template list --json
```

**List styles in a family** to inspect the Word styles actually available:

```bash
markidocx template styles
markidocx template styles --family book
markidocx template styles --family article --json
# [
#   {"name": "Heading 1", "style_id": "Heading1", "type": "paragraph", ...},
#   {"name": "Normal", "style_id": "Normal", "type": "paragraph", ...},
#   ...
# ]
```

**Register a custom template:**

```bash
markidocx template register my-brand.docx --name brand --description "Corporate brand"
```

**Extract a template from an existing Word document:**

```bash
markidocx template extract existing-report.docx
# ✓ Template extracted: existing-report-template.docx
#   Styles preserved: 42

markidocx template extract existing-report.docx \
  --template-out corporate-template.docx \
  --family corporate
```

This strips all body content while preserving every style, page setup, header, footer, and theme from the source document. The result is a content-free template ready for use with `markidocx build`.

---

## 10. Word-First Round-Trip (UC-006 variant)

If you have an existing Word document and want to bring it into the markidocx workflow:

```bash
# Step 1: extract the template shell
markidocx template extract report.docx --template-out report-template.docx

# Step 2: import the content to Markdown
markidocx import manifest.yaml report.docx
#   → content.md

# Step 3: edit content.md in your editor, then rebuild
markidocx build manifest.yaml
# ✓ Built: dist/report.docx

# Step 4: verify zero structural drift from the original
markidocx compare manifest.yaml dist/report.docx
# ✓ No drift detected
```

---

## 11. REST Service (UC-019)

Start the service:

```bash
markidocx serve                      # production
markidocx serve --dev --port 8080    # dev mode with auto-reload
```

All CLI operations have REST equivalents:

| CLI | REST |
|-----|------|
| `validate` | `POST /validate` |
| `build` | `POST /build` |
| `import` | `POST /import` |
| `compare` | `POST /compare` |
| `workflow` | `POST /workflows/{name}` |
| `evidence get` | `GET /evidence/{run_id}` |
| `template list` | `GET /templates` |
| `template styles` | `GET /styles?family=article` |
| `template extract` | `POST /template/extract` |

**Example: build via REST:**

```bash
curl -X POST http://localhost:8000/build \
  -H "Content-Type: application/json" \
  -d '{
    "manifest_yaml": "project:\n  name: Test\n  feature_level: level1\n  family: article\nsources:\n  - path: doc.md\noutput:\n  dir: ./dist\n",
    "sources": [{"name": "doc.md", "content": "# Hello\n\nWorld.\n"}]
  }'
# {"status": "ok", "outputs": {"docx_base64": "..."}, "warnings": []}
```
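
A client typically needs two small steps around this call: serialising the request body and decoding the returned document. The sketch below covers both using only the standard library and deliberately leaves out the HTTP call itself. The field names (`manifest_yaml`, `sources`, `name`, `content`, `outputs.docx_base64`, `status`) are taken from the example above; the helper names `build_payload` and `extract_docx` are hypothetical, and the error handling is an assumption about how a non-`ok` response should be treated.

```python
import base64
import json
from typing import Dict


def build_payload(manifest_yaml: str, sources: Dict[str, str]) -> str:
    """Serialise a /build request body from a manifest string and {name: content}."""
    body = {
        "manifest_yaml": manifest_yaml,
        "sources": [{"name": name, "content": content} for name, content in sources.items()],
    }
    return json.dumps(body)


def extract_docx(response_text: str) -> bytes:
    """Decode the DOCX bytes from a /build response, or raise if the build failed."""
    response = json.loads(response_text)
    if response.get("status") != "ok":
        raise ValueError(f"build did not succeed: {response.get('status')!r}")
    return base64.b64decode(response["outputs"]["docx_base64"])


payload = build_payload("project:\n  name: Test\n", {"doc.md": "# Hello\n\nWorld.\n"})

# Stand-in response for illustration; a real DOCX is a ZIP archive and starts with b"PK".
fake_response = json.dumps(
    {
        "status": "ok",
        "outputs": {"docx_base64": base64.b64encode(b"PK\x03\x04demo").decode("ascii")},
        "warnings": [],
    }
)
docx_bytes = extract_docx(fake_response)
```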

**Capability and health discovery:**

```bash
curl http://localhost:8000/capabilities
curl http://localhost:8000/health
curl http://localhost:8000/version
```

---

## 12. MCP Tools (UC-020)

Start the MCP server:

```bash
markidocx mcp
```

Available tools (callable by any MCP-compatible agent):

| Tool | Description |
|------|-------------|
| `validate_project(manifest_yaml)` | Validate a manifest |
| `inspect_project(manifest_yaml)` | Inspect project structure + capabilities |
| `build(manifest_yaml, sources)` | Build DOCX, returns `docx_base64` |
| `import_docx(manifest_yaml, docx_base64)` | Import DOCX to Markdown |
| `compare(manifest_yaml, docx_base64, sources)` | Drift detection |
| `run_tests(manifest_yaml, sources)` | End-to-end regression |
| `invoke_workflow(name, manifest_yaml, sources)` | Named workflow |
| `get_evidence(run_id)` | Retrieve evidence set |
| `list_templates()` | Available families |
| `list_styles(family)` | Styles in a family |
| `extract_template(source_path, template_out)` | Extract template shell |
| `get_version()` | Version info |

---

## 13. Version and Health (UC-024)

```bash
markidocx --version
# markidocx 0.1.0

# Via REST
curl http://localhost:8000/health
# {"status": "ok", "version": "0.1.0"}
```

---

## Summary: The Core Workflow

```
1. Author writes Markdown   → manifest.yaml + *.md files
2. markidocx inspect        → confirm structure and capabilities
3. markidocx build          → dist/document.docx (send to reviewer)
4. Reviewer edits DOCX      → document-reviewed.docx (returned)
5. markidocx import         → Markdown updated with reviewer edits
6. markidocx compare        → drift report confirms what changed
7. markidocx evidence list  → audit trail for every run
```

All three interfaces (CLI, REST, MCP) expose the same functional model. No capability is interface-specific: every operation accessible via the CLI is equally accessible to a pipeline or an agent.