diff --git a/docs/tutorial.md b/docs/tutorial.md
new file mode 100644
index 0000000..d20ee22
--- /dev/null
+++ b/docs/tutorial.md
@@ -0,0 +1,417 @@
+# markidocx Tutorial
+
+## Overview
+
+markidocx is a **Markdown ↔ DOCX round-trip editing system**. Markdown is the canonical source of truth; Word documents are editorial projections used for review. Every operation preserves this asymmetry — edits made in Word flow back into Markdown, not the other way around.
+
+All capabilities are available through three equivalent interfaces:
+- **CLI** — local document workflows
+- **REST** — pipeline and automation integration (`markidocx serve`)
+- **MCP** — agent-accessible tools (`markidocx mcp`)
+
+---
+
+## 1. Define a Project (UC-001)
+
+Everything in markidocx starts with a **manifest file** — a YAML declaration of your sources, feature level, and document family.
+
+```yaml
+# manifest.yaml
+project:
+  name: "Technical Specification"
+  feature_level: level1   # or level3 for advanced features
+  family: article         # article | book | website
+
+sources:
+  - path: intro.md
+  - path: chapters/design.md
+  - path: chapters/api.md
+
+output:
+  dir: ./dist
+```
+
+**Feature levels:**
+- `level1` — headings, lists, tables, footnotes, images, links
+- `level3` — everything in LEVEL1, plus cross-references, numbered figures, auto-diagrams (Mermaid/Graphviz/PlantUML), bibliography
+
+**Built-in families:**
+
+| Family | Description |
+|--------|-------------|
+| `article` | Single-document article layout |
+| `book` | Multi-chapter book layout |
+| `website` | Web-optimised document layout |
+
+---
+
+## 2. Validate and Inspect (UC-002, UC-003)
+
+Before building, confirm the project is well-formed.
+
+**Validate** — checks manifest structure, source file existence, family/level compatibility:
+
+```bash
+markidocx validate manifest.yaml
+# ✓ Manifest valid: Technical Specification
+
+markidocx validate manifest.yaml --json
+# {"status": "ok", "project": "Technical Specification"}
+```
+
+**Inspect** — shows the full resolved structure including LEVEL3 capability availability:
+
+```bash
+markidocx inspect manifest.yaml
+# Project: Technical Specification
+#   family: article
+#   feature_level: level1
+#   sources: intro.md, chapters/design.md, chapters/api.md
+#   level3 xref: False
+#   level3 diag: False
+
+markidocx inspect manifest.yaml --json
+# {
+#   "status": "ok",
+#   "project": "Technical Specification",
+#   "family": "article",
+#   "feature_level": "level1",
+#   "sources": ["intro.md", "chapters/design.md", "chapters/api.md"],
+#   "level3": {"xref_available": false, "diagrams_available": false, ...}
+# }
+```
+
+The `level3` block tells you which optional processors (`mmdc`, `dot`, `plantuml`) are available on your PATH.
+
+---
+
+## 3. Build a DOCX (UC-004, UC-005, UC-014, UC-015)
+
+Compile Markdown sources into a Word document:
+
+```bash
+markidocx build manifest.yaml
+# ✓ Built: dist/technical-specification.docx
+
+markidocx build manifest.yaml --json
+# {"status": "ok", "output_path": "dist/...", "family": "article", "warnings": []}
+```
+
+**Switching families** (UC-005) — change `family:` in the manifest to re-build with different styling. All three built-in families are always available without any setup.
+
+**LEVEL3 document** (UC-015) — set `feature_level: level3` and include advanced constructs in your Markdown:
+
+````markdown
+
+See [Section 2][sec-design].
+
+
+![Architecture diagram](arch.png)
+*Figure 1: System architecture*
+
+
+
+```mermaid
+graph TD
+    A[Client] --> B[API]
+    B --> C[Database]
+```
+
+
+As noted in [@smith2023], the approach is sound.
+
+## References
+
+- [@smith2023]: Smith, J. *Technical Approaches*. 2023.
+````
+
+If a diagram renderer is unavailable, markidocx falls back to embedding the source as a verbatim code block and emits a warning — **source is never silently discarded**.
+
+---
+
+## 4. Import an Edited DOCX (UC-006)
+
+After a reviewer edits the Word document, import their changes back to Markdown:
+
+```bash
+markidocx import manifest.yaml dist/technical-specification-reviewed.docx
+# ✓ Imported (mapped)
+#   → intro.md
+#   → chapters/design.md
+#   → chapters/api.md
+```
+
+For **single-file projects** the import produces one `.md` file. For **multi-file projects**, markidocx redistributes content back to the original source files using heading boundaries as guides. If redistribution is ambiguous, it falls back to a single merged file and reports `mapping_status: fallback`.
+
+---
+
+## 5. Detect Round-Trip Drift (UC-011)
+
+After importing, check whether any structure was lost or degraded:
+
+```bash
+markidocx compare manifest.yaml dist/technical-specification-reviewed.docx
+# ✓ No drift detected
+#   preserved: 12 elements
+
+# Or with drift:
+# ⚠ Drift detected
+#   degraded: heading:## Background (1/2)
+#   broken: footnote:[^1]
+```
+
+The drift report classifies every structural element as:
+- **preserved** — identical in original and re-import
+- **degraded** — present but modified
+- **broken** — present in original, missing from re-import
+- **unsupported** — construct not supported at the declared feature level
+
+---
+
+## 6. Full Round-Trip Workflow (UC-007)
+
+The `workflow` command runs the full cycle in one step:
+
+```bash
+markidocx workflow single-file-roundtrip manifest.yaml
+# ✓ Workflow single-file-roundtrip: pass
+#   ✓ validate: executed
+#   ✓ build: executed
+#   ✓ import: executed
+#   ✓ compare: executed
+#   run_id: a3f91c2e-...
+
+# Multi-file variant
+markidocx workflow multi-file-roundtrip manifest.yaml
+```
+
+Available workflows:
+
+| Workflow | Steps |
+|----------|-------|
+| `single-file-roundtrip` | validate → build → import → compare |
+| `multi-file-roundtrip` | validate → build → import → redistribute → compare |
+| `release-regression` | full regression against the stable corpus |
+| `family-switch-build` | build under each of the three built-in families |
+
+---
+
+## 7. Run the Test Suite (UC-021)
+
+Run the end-to-end regression harness for a project:
+
+```bash
+markidocx test manifest.yaml
+# ✓ Tests: 4 passed, 0 failed, 0 skipped
+#   ✓ validate: executed
+#   ✓ build: executed
+#   ✓ import: executed
+#   ✓ compare: executed
+#   run_id: b7d04a1e-...
+
+# Exit code 0 on pass, 1 on any failure — CI-friendly
+markidocx test manifest.yaml --json
+```
+
+---
+
+## 8. Evidence and Audit Trail (UC-025, UC-022)
+
+Every `build`, `import`, `compare`, and `workflow` run produces a persistent evidence record keyed by `run_id`.
+
+**List recent runs:**
+
+```bash
+markidocx evidence list
+markidocx evidence list --limit 5 --json
+```
+
+**Retrieve a run's evidence:**
+
+```bash
+markidocx evidence get a3f91c2e-...
+# ✓ Run: a3f91c2e-... [pass]
+#   Reports: 4
+#   Warnings: 0
+#   Errors: 0
+#   • build (a3f91c2e-…)
+#   • import (a3f91c2e-…)
+#   • compare (a3f91c2e-…)
+#   • validation (a3f91c2e-…)
+
+markidocx evidence get a3f91c2e-... --json
+markidocx evidence get a3f91c2e-... --output evidence.json
+```
+
+The assembled **EvidenceSet** reports:
+- `classification` — `pass` | `pass-with-warnings` | `failed`
+- `components` — which report types are present
+- `completeness_note` — if expected reports are absent for the workflow type
+
+---
+
+## 9. Template Management (UC-012, UC-013)
+
+**List families:**
+
+```bash
+markidocx template list
+markidocx template list --json
+```
+
+**List styles in a family** — inspect the actual Word styles available:
+
+```bash
+markidocx template styles
+markidocx template styles --family book
+markidocx template styles --family article --json
+# [
+#   {"name": "Heading 1", "style_id": "Heading1", "type": "paragraph", ...},
+#   {"name": "Normal", "style_id": "Normal", "type": "paragraph", ...},
+#   ...
+# ]
+```
+
+**Register a custom template:**
+
+```bash
+markidocx template register my-brand.docx --name brand --description "Corporate brand"
+```
+
+**Extract a template from an existing Word document:**
+
+```bash
+markidocx template extract existing-report.docx
+# ✓ Template extracted: existing-report-template.docx
+#   Styles preserved: 42
+
+markidocx template extract existing-report.docx \
+  --template-out corporate-template.docx \
+  --family corporate
+```
+
+This strips all body content while preserving every style, page setup, header, footer, and theme from the source document. The result is a content-free template ready for use with `markidocx build`.
+
+---
+
+## 10. Word-First Round-Trip (UC-006 variant)
+
+If you have an existing Word document and want to bring it into the markidocx workflow:
+
+```bash
+# Step 1: extract the template shell
+markidocx template extract report.docx --template-out report-template.docx
+
+# Step 2: import the content to Markdown
+markidocx import manifest.yaml report.docx
+# → content.md
+
+# Step 3: edit content.md in your editor, then rebuild
+markidocx build manifest.yaml
+# ✓ Built: dist/report.docx
+
+# Step 4: verify zero structural drift from the original
+markidocx compare manifest.yaml dist/report.docx
+# ✓ No drift detected
+```
+
+---
+
+## 11. REST Service (UC-019)
+
+Start the service:
+
+```bash
+markidocx serve                      # production
+markidocx serve --dev --port 8080    # dev mode with auto-reload
+```
+
+All CLI operations have REST equivalents:
+
+| CLI | REST |
+|-----|------|
+| `validate` | `POST /validate` |
+| `build` | `POST /build` |
+| `import` | `POST /import` |
+| `compare` | `POST /compare` |
+| `workflow` | `POST /workflows/{name}` |
+| `evidence get` | `GET /evidence/{run_id}` |
+| `template list` | `GET /templates` |
+| `template styles` | `GET /styles?family=article` |
+| `template extract` | `POST /template/extract` |
+
+**Example — build via REST:**
+
+```bash
+curl -X POST http://localhost:8000/build \
+  -H "Content-Type: application/json" \
+  -d '{
+    "manifest_yaml": "project:\n name: Test\n feature_level: level1\n family: article\nsources:\n - path: doc.md\noutput:\n dir: ./dist\n",
+    "sources": [{"name": "doc.md", "content": "# Hello\n\nWorld.\n"}]
+  }'
+# {"status": "ok", "outputs": {"docx_base64": "..."}, "warnings": []}
+```
+
+**Capability and health discovery:**
+
+```bash
+curl http://localhost:8000/capabilities
+curl http://localhost:8000/health
+curl http://localhost:8000/version
+```
+
+---
+
+## 12. MCP Tools (UC-020)
+
+Start the MCP server:
+
+```bash
+markidocx mcp
+```
+
+Available tools (callable by any MCP-compatible agent):
+
+| Tool | Description |
+|------|-------------|
+| `validate_project(manifest_yaml)` | Validate a manifest |
+| `inspect_project(manifest_yaml)` | Inspect project structure + capabilities |
+| `build(manifest_yaml, sources)` | Build DOCX, returns `docx_base64` |
+| `import_docx(manifest_yaml, docx_base64)` | Import DOCX to Markdown |
+| `compare(manifest_yaml, docx_base64, sources)` | Drift detection |
+| `run_tests(manifest_yaml, sources)` | End-to-end regression |
+| `invoke_workflow(name, manifest_yaml, sources)` | Named workflow |
+| `get_evidence(run_id)` | Retrieve evidence set |
+| `list_templates()` | Available families |
+| `list_styles(family)` | Styles in a family |
+| `extract_template(source_path, template_out)` | Extract template shell |
+| `get_version()` | Version info |
+
+---
+
+## 13. Version and Health (UC-024)
+
+```bash
+markidocx --version
+# markidocx 0.1.0
+
+# Via REST
+curl http://localhost:8000/health
+# {"status": "ok", "version": "0.1.0"}
+```
+
+---
+
+## Summary: The Core Workflow
+
+```
+1. Author writes Markdown    → manifest.yaml + *.md files
+2. markidocx inspect         → confirm structure and capabilities
+3. markidocx build           → dist/document.docx (send to reviewer)
+4. Reviewer edits DOCX       → document-reviewed.docx (returned)
+5. markidocx import          → Markdown updated with reviewer edits
+6. markidocx compare         → drift report confirms what changed
+7. markidocx evidence list   → audit trail for every run
+```
+
+All three interfaces (CLI, REST, MCP) expose the same functional model. No capability is interface-specific — every operation accessible via the CLI is equally accessible to a pipeline or an agent.
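
Review note: the tutorial's REST build example uses `curl`; a pipeline would more likely call the endpoint from code. The following is a minimal sketch of that call in Python, assuming the request and response shapes shown in section 11 (`manifest_yaml`, `sources`, `outputs.docx_base64`). The helper names (`make_build_payload`, `extract_docx`, `build_via_rest`) are hypothetical, and treating any status other than `"ok"` as a failure is an assumption, since the tutorial does not show an error response.

```python
import base64
import json
import urllib.request


def make_build_payload(manifest_yaml: str, sources: dict[str, str]) -> str:
    """Serialise a /build request body in the shape shown in the tutorial."""
    body = {
        "manifest_yaml": manifest_yaml,
        "sources": [
            {"name": name, "content": text} for name, text in sources.items()
        ],
    }
    return json.dumps(body)


def extract_docx(response_text: str) -> bytes:
    """Decode the DOCX bytes from a /build response.

    Assumption: a non-"ok" status means the build failed; the tutorial
    only documents the success shape.
    """
    reply = json.loads(response_text)
    if reply.get("status") != "ok":
        raise RuntimeError(f"build did not report ok: {reply}")
    return base64.b64decode(reply["outputs"]["docx_base64"])


def build_via_rest(base_url: str, manifest_yaml: str, sources: dict[str, str]) -> bytes:
    """POST to {base_url}/build and return the decoded DOCX bytes."""
    request = urllib.request.Request(
        f"{base_url}/build",
        data=make_build_payload(manifest_yaml, sources).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_docx(response.read().decode("utf-8"))
```

With the service from section 11 running, `build_via_rest("http://localhost:8000", manifest_text, {"doc.md": "# Hello\n\nWorld.\n"})` would return bytes that can be written straight to a `.docx` file. Keeping payload construction and response decoding in separate functions lets both be checked without a running server.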