diff --git a/workplans/MRKD-WP-0007-interface-completeness-evidence.md b/workplans/MRKD-WP-0007-interface-completeness-evidence.md new file mode 100644 index 0000000..1149579 --- /dev/null +++ b/workplans/MRKD-WP-0007-interface-completeness-evidence.md @@ -0,0 +1,388 @@ +--- +id: MRKD-WP-0007 +type: workplan +domain: markitect +repo: marki-docx +status: active +state_hub_workstream_id: 61701224-0813-4258-9308-025bcec41780 +created: 2026-03-17 +updated: 2026-03-17 +--- + +# MRKD-WP-0007 — Interface Completeness & Evidence + +Close the remaining FRS v0.2 gaps identified after WP-0001 through WP-0006. +The system is ~92% complete; this workplan brings it to full FRS coverage. + +Four clusters of functional gaps plus one new capability: + +1. **CLI parity** — `inspect`, `test`, and `evidence` commands exist in REST and MCP + but are absent from the CLI (FR-806, FR-810, FR-1409) +2. **Style listing stub** — `GET /styles` and MCP `list_styles` return `[]`; real + style metadata enumeration is needed (FR-907) +3. **Evidence assembly** — individual reports exist but the release evidence set + has no unified aggregation or completeness disclosure (FR-1406–1408, FR-1413) +4. **LEVEL3 edge-case coverage** — core paths are tested; targeted tests needed for + diagram source mutation, processor-dependency matrix, and bibliography ambiguity + (FR-534, FR-538, FR-542) +5. 
**Word-first round-trip** — new end-to-end capability: extract content and a + content-free template from an existing DOCX, then verify that MD + template → DOCX + reproduces the original document + +**Scope:** FR-806, FR-810, FR-907, FR-1409, FR-1406–1408, FR-1413, +FR-534, FR-538, FR-542, new template-extraction capability +**Out of scope:** new document families, non-DOCX output formats +**Depends on:** WP-0001 through WP-0006 — all complete + +--- + +## T01 — Add `markidocx inspect` and `markidocx test` CLI commands + +```task +id: MRKD-WP-0007-T01 +status: todo +priority: high +state_hub_task_id: f77db529-b17b-4462-a704-2b9a3dbdc892 +``` + +The underlying logic for both commands already exists and is exposed via MCP +(`inspect_project`, `run_tests`) and REST. This task wires them into the CLI. + +**`markidocx inspect `** (FR-806) +- Calls the same project-inspection logic as `inspect_project` in MCP +- Outputs: source files, feature level, template family, detected LEVEL3 constructs, + capability disclosure (which renderers/processors are available) +- `--json` flag: machine-readable output +- Mirrors the REST `GET /inspect` response structure + +**`markidocx test `** (FR-810) +- Runs the regression test suite for the project (same as MCP `run_tests`) +- Outputs: pass/fail counts, skipped tests, any failures with locations +- `--json` flag: machine-readable output +- Exit code 0 on pass, 1 on any failure + +Implementation notes: +- Add `@app.command()` entries in `cli.py`; delegate to existing logic in + `builder.py` / `level3.py` / workflows +- Update `test_interface_parity.py` to assert CLI/REST/MCP parity for both commands +- Add unit tests in `tests/test_cli_inspect_test.py` + +Deliverable: `markidocx inspect ` and `markidocx test ` work; +interface parity tests pass. 
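The output and exit-code contract that T01 gives for `markidocx test` (counts, failure locations, a `--json` mode, exit code 0 on pass and 1 on any failure) can be sketched independently of the CLI framework. This is a minimal sketch only: `RunSummary` and `render_test_summary` are hypothetical names, and the real return type of `run_tests` is not shown in this workplan.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunSummary:
    """Hypothetical result shape; not the project's actual type."""
    passed: int = 0
    failed: int = 0
    skipped: int = 0
    failures: list = field(default_factory=list)  # e.g. "tests/x.py::test_y"


def render_test_summary(summary: RunSummary, as_json: bool = False):
    """Return (text to print, process exit code) for one test run."""
    # Exit code 0 on pass, 1 on any failure, as stated in T01.
    exit_code = 1 if summary.failed > 0 else 0
    if as_json:
        # Machine-readable form for the --json flag.
        return json.dumps(asdict(summary), sort_keys=True), exit_code
    lines = [
        f"passed={summary.passed} failed={summary.failed} skipped={summary.skipped}"
    ]
    lines.extend(f"FAILED {location}" for location in summary.failures)
    return "\n".join(lines), exit_code
```

A CLI command registered with `@app.command()` could print the returned text and then exit with the returned code, which keeps the formatting logic testable without invoking the CLI.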
+ +--- + +## T02 — Add `markidocx evidence` CLI command + +```task +id: MRKD-WP-0007-T02 +status: todo +priority: high +state_hub_task_id: 0af8c5bb-c01b-48cf-9895-f6c8033b0606 +``` + +Evidence retrieval is exposed via REST (`GET /evidence/{run_id}`) and MCP +(`get_evidence`) but has no CLI surface (FR-1409, FR-814). + +**`markidocx evidence `** +- Accepts a `run_id` (returned by `build`, `import`, `compare`, `workflow`) +- Retrieves the full evidence record from the evidence store +- Outputs: human-readable summary (validation result, warnings, drift counts, + overall pass/warn/fail status) +- `--json` flag: full machine-readable evidence record +- `--output ` flag: write evidence JSON to file + +**`markidocx evidence list`** (subcommand) +- Lists run IDs available in the evidence store, newest first +- `--limit N` (default 10) +- `--json` flag + +Implementation notes: +- Extend `cli.py` with an `evidence` group using `typer.Typer()` +- Delegates to `evidence.py` store +- Add `--run-id` output to existing `build`, `import`, `compare` commands so the + user knows what ID to retrieve (currently run_id is only in JSON output) +- Update `test_interface_parity.py` to assert parity + +Deliverable: `markidocx evidence ` and `markidocx evidence list` work and +are parity-tested against REST and MCP. + +--- + +## T03 — Implement style listing (replace stub in REST and MCP) + +```task +id: MRKD-WP-0007-T03 +status: todo +priority: medium +state_hub_task_id: e26c824c-868f-470e-bdfc-e1ae18aa7ebe +``` + +`GET /styles` (FR-907) and MCP `list_styles` both return `[]`. The template +files (`.docx`) already contain named paragraph and character styles; they just +need to be enumerated. + +**Style metadata model:** +```python +@dataclass +class StyleEntry: + name: str # e.g. "Heading 1", "Body Text" + style_id: str # Word's internal ID, e.g. "Heading1" + type: str # "paragraph" | "character" | "table" | "numbering" + family: str # template family this style belongs to, e.g. 
"article" + built_in: bool # True if a Word built-in style +``` + +**`list_styles(family: str | None) -> list[StyleEntry]`** in `templates.py`: +- Opens the template DOCX for the given family (or default) +- Enumerates all styles via `python-docx`'s `document.styles` +- Returns `StyleEntry` list sorted by type then name + +**Wire into interfaces:** +- REST `GET /styles?family=article` → `list[StyleEntry]` as JSON +- MCP `list_styles(family=...)` → same +- CLI `markidocx template styles [--family article]` → tabular output (already has + `template_app` Typer sub-app) + +**Tests:** +- `test_templates.py`: assert at least the standard heading/body styles are present + for each built-in family +- Interface parity test: REST, MCP, CLI all return the same set for the same family + +Deliverable: `markidocx template styles`, `GET /styles`, `list_styles()` return real +style data for all three built-in families. + +--- + +## T04 — Strengthen evidence assembly — unified status summary and composition disclosure + +```task +id: MRKD-WP-0007-T04 +status: todo +priority: medium +state_hub_task_id: d9ef5925-f70f-4e97-a2d4-6932c4c531d6 +``` + +Individual evidence records (validation, build, import, drift) exist but there is no +formal aggregation into a release evidence set (FR-1406–1408, FR-1413). + +**Release evidence set structure** (`EvidenceSet` in `evidence.py`): +```python +@dataclass +class EvidenceSet: + run_id: str + created_at: str + manifest_path: str + components: list[str] # which reports are present (FR-1407) + overall_status: str # "pass" | "pass-with-warnings" | "fail" (FR-1408) + validation_result: ... | None + build_result: ... | None + import_result: ... | None + drift_result: ... 
| None + warnings: list[WarningRecord] # aggregated across all components + completeness_note: str | None # which expected components are absent (FR-1413) +``` + +**`assemble_evidence_set(run_id: str) -> EvidenceSet`**: +- Reads all component records for the run from the evidence store +- Derives `overall_status`: `fail` if any component failed; `pass-with-warnings` if + any warnings exist; `pass` otherwise +- Sets `completeness_note` if expected components are absent for the workflow type + (e.g. a roundtrip workflow should have build + import + drift; if drift is absent, + note it) +- Enumerates `components` list (FR-1407) + +**Wire into interfaces:** +- REST `GET /evidence/{run_id}` → return `EvidenceSet` instead of raw record +- MCP `get_evidence(run_id)` → same +- CLI `markidocx evidence ` → display `EvidenceSet` summary +- Workflow commands: assemble and persist the evidence set at workflow completion + +**Tests:** +- `test_evidence.py`: assert `assemble_evidence_set` returns correct `overall_status` + for pass / pass-with-warnings / fail scenarios +- Assert `components` enumeration is accurate +- Assert `completeness_note` fires when a component is absent + +Deliverable: All three interfaces return a coherent `EvidenceSet` with `overall_status`, +`components`, and `completeness_note`. Existing evidence tests still pass. + +--- + +## T05 — LEVEL3 edge-case coverage + +```task +id: MRKD-WP-0007-T05 +status: todo +priority: low +state_hub_task_id: 20789d1c-4495-468f-bbb7-912e63e804e4 +``` + +Core LEVEL3 paths are tested; this task adds targeted tests for three +undertested edge-case areas. 
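As an illustration of one of these areas, the renderer version check behind FR-538 could be a small pure function that the tests call with mocked `mmdc --version` output. This is a sketch under assumptions: the minimum supported major version (10 here) is a guess based on the "10.x.x" versus "8.x.x" examples in this workplan, the `WarningRecord` below is a stand-in with hypothetical fields, and treating unparseable output as unsupported is a design choice rather than a stated requirement.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: the workplan only says 10.x.x is supported and 8.x.x is too old.
MIN_SUPPORTED_MAJOR = 10


@dataclass
class WarningRecord:
    """Stand-in for the project's WarningRecord; fields are hypothetical."""
    reason: str
    detail: str


def check_renderer_version(
    version_output: str, min_major: int = MIN_SUPPORTED_MAJOR
) -> Optional[WarningRecord]:
    """Return a warning if the reported version is too old, else None."""
    match = re.search(r"(\d+)\.\d+\.\d+", version_output)
    if match is None:
        # Assumption: output we cannot parse is reported, not ignored.
        return WarningRecord(
            reason="renderer-version-unsupported",
            detail=f"could not parse version from {version_output!r}",
        )
    major = int(match.group(1))
    if major < min_major:
        return WarningRecord(
            reason="renderer-version-unsupported",
            detail=f"found major version {major}, need >= {min_major}",
        )
    return None
```

Keeping the check free of subprocess calls means the tests only need to mock the command output string, not the renderer itself.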
+ +**FR-534 — Diagram source mutation on round-trip** +- Test: build a DOCX with a mermaid block; manually alter the alt-text source marker + in the DOCX (simulate editorial mutation of the embedded diagram); import → assert + that `differ.py` classifies the change as `structural` (not silently dropped) +- Test: assert that a diagram block with empty source produces a `WarningRecord` + +**FR-538 — Processor dependency version matrix** +- Test: mock `shutil.which("mmdc")` to return a path; mock the subprocess call to + return `mmdc --version` → `"10.x.x"` (supported) vs. `"8.x.x"` (too old) +- Assert that an outdated renderer produces `WarningRecord(reason="renderer-version-unsupported")` + rather than silently falling back (requires adding version-check logic to renderer + backends in `diagrams.py` if not already present) +- If version-checking is not yet in `diagrams.py`, add it as part of this task + +**FR-542 — Bibliography ambiguity edge cases** +- Test: document with two citations sharing the same key → assert `WarningRecord` +- Test: document with a citation key that has no corresponding reference entry → + assert `WarningRecord(reason="citation-key-missing")` +- Test: round-trip of a references section with special characters in author names + +**Tests location:** extend `tests/test_level3_diagrams.py`, `tests/test_level3_bibliography.py` + +Deliverable: All three edge-case areas have at least two targeted tests each. +Existing LEVEL3 tests still pass. + +--- + +## T06 — End-to-end Word-first round-trip: template extraction and rebuild verification + +```task +id: MRKD-WP-0007-T06 +status: todo +priority: high +state_hub_task_id: 0c16c598-bd49-4721-89a3-e989e1d36879 +``` + +This task delivers a new capability: given an existing Word document as the starting +point, marki-docx can decompose it into a Markdown content file and a content-free +DOCX template, and then verify that recombining the two recreates the original document. 
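The "content-free template" half of this decomposition can be sketched with the standard library alone, since a DOCX file is a ZIP archive whose body lives in `word/document.xml`. This is not the planned implementation, which this workplan says will use `python-docx`; `strip_body` and `extract_template_sketch` are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
BODY = f"{{{W_NS}}}body"
SECT_PR = f"{{{W_NS}}}sectPr"


def strip_body(document_xml: bytes):
    """Remove every body child except body-level section properties."""
    ET.register_namespace("w", W_NS)  # keep the conventional "w:" prefix
    root = ET.fromstring(document_xml)
    body = root.find(BODY)
    removed = 0
    if body is not None:
        for child in list(body):
            if child.tag != SECT_PR:  # sectPr carries page setup
                body.remove(child)
                removed += 1
    xml_out = ET.tostring(root, encoding="utf-8", xml_declaration=True)
    return xml_out, removed


def extract_template_sketch(source_path: Path, template_out: Path) -> int:
    """Copy all archive parts unchanged, emptying only the document body.

    Returns the number of body elements removed.
    """
    removed = 0
    with zipfile.ZipFile(source_path) as src, zipfile.ZipFile(
        template_out, "w", zipfile.ZIP_DEFLATED
    ) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                data, removed = strip_body(data)
            dst.writestr(item, data)
    return removed
```

Styles, headers, footers, and theme parts are separate archive members, so they pass through untouched. Two caveats apply to this sketch: section breaks stored inside paragraphs are dropped along with those paragraphs, and `ElementTree` rewrites namespace prefixes it does not know, which a real Word file may not tolerate.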
+ +This closes the loop on the round-trip: the existing flow is MD → DOCX → MD; this +adds DOCX → (MD + template) → DOCX, making Word-authored documents first-class inputs. + +### New command: `markidocx template extract ` + +Extracts the structural and stylistic shell of `source.docx` — keeping all styles, +page setup, headers/footers, section properties, and theme data — while removing all +body content (paragraphs, tables, figures, etc.). + +``` +markidocx template extract \ + [--template-out ] # default: -template.docx + [--content-out ] # default: .md (runs import) + [--family ] # register extracted template under this family name + [--json] +``` + +**Outputs:** +1. `` — the content-free shell (styles preserved, body empty) +2. `` — the Markdown content extracted via the existing `import` path + +**Implementation in `templates.py`:** +```python +def extract_template(source_path: Path, template_out: Path) -> TemplateExtractionResult: + """ + Open source_path with python-docx. Copy all styles, page setup, + headers/footers, and theme. Clear the document body (remove all + paragraphs and tables). Save to template_out. 
+ """ +``` + +`TemplateExtractionResult`: +```python +@dataclass +class TemplateExtractionResult: + template_path: Path + styles_preserved: int # count of styles copied + warnings: list[WarningRecord] +``` + +**Wire into CLI:** +- `template_app` already exists in `cli.py`; add `extract` subcommand +- After extraction, optionally run `import` on the source to produce the `.md` file +- Print a summary: styles preserved, content extracted, paths written + +**Wire into REST and MCP:** +- REST: `POST /template/extract` — multipart upload of `source.docx`; returns + `TemplateExtractionResult` + download URLs for template and MD +- MCP: `extract_template(source_path: str, template_out: str, content_out: str)` + +### End-to-end regression test + +Add `tests/regression/test_word_first_roundtrip.py`: + +``` +Fixture: tests/regression/fixtures/word_first/source.docx + — A representative Word document with headings, body text, a table, + an image, and a footer. Committed to the repo as a binary fixture. + +Test: test_word_first_roundtrip + 1. extract_template(source.docx) → template.docx + content.md + 2. Assert template.docx has zero body paragraphs + 3. Assert template.docx preserves at least the styles present in source.docx + 4. Assert content.md is non-empty and contains the expected headings + 5. build(manifest pointing at content.md + template.docx) → rebuilt.docx + 6. import(rebuilt.docx) → reimported.md + 7. Assert reimported.md is structurally equivalent to content.md + (use differ.py; assert zero structural drift) + +Test: test_template_extraction_idempotent + 1. extract_template(source.docx) → template_a.docx + 2. extract_template(template_a.docx) → template_b.docx + 3. 
Assert template_b has same style set as template_a (extraction of an + already-empty template is a no-op) +``` + +**Fixture creation:** +- Create `tests/regression/fixtures/word_first/` directory +- Programmatically generate `source.docx` using `python-docx` in a fixture-generator + script (`tests/regression/fixtures/word_first/generate.py`) — this keeps the binary + reproducible from source +- Commit the generated `source.docx` as a stable binary fixture (tracked in git) + +### Success criteria for T06 + +1. `markidocx template extract source.docx` produces a valid content-free template + and a Markdown content file +2. The extracted template + content can be built back into a DOCX via `markidocx build` +3. The rebuilt DOCX imports cleanly with zero structural drift against the extracted + content +4. `test_word_first_roundtrip` passes in CI +5. REST and MCP surfaces expose the new capability + +--- + +## Execution order + +- T01, T02, T03 are independent — can be worked in any order or in parallel +- T04 depends on T02 (the evidence CLI command exposes the assembled set) +- T05 is independent — can be worked at any time +- T06 is independent of T01–T05 but benefits from T04 (evidence for the rebuild step) + +## Updating task status + +``` +status: todo → status: in_progress (when you start it) +status: in_progress → status: done (when verified complete) +``` + +When every task is `done`, set the frontmatter `status: done`. + +## Success criteria + +Before marking the workplan done: + +1. Every task block has `status: done` +2. Workplan frontmatter `status: done` +3. Full test suite passes (`pytest --tb=short -q`) +4. `ruff check` and `mypy src/` clean +5. `markidocx inspect`, `markidocx test`, `markidocx evidence`, `markidocx template extract` + all present and functional +6. `GET /styles` returns real style data (not `[]`) +7. `markidocx evidence ` returns an `EvidenceSet` with `overall_status` +8. `test_word_first_roundtrip` passes +9. 
LEVEL3 edge-case tests added and passing
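
The `overall_status` and `completeness_note` rules from T04, which criterion 7 relies on, can be sketched as two pure functions. This is a minimal sketch: `ComponentRecord` is a hypothetical stand-in for the stored component records, and the expected-components table contains only the roundtrip example given in T04.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ComponentRecord:
    """Hypothetical minimal view of one evidence component."""
    name: str  # e.g. "validation", "build", "import", "drift"
    failed: bool = False
    warnings: list = field(default_factory=list)


# Assumption: only the roundtrip row is stated in the workplan.
EXPECTED_COMPONENTS = {"roundtrip": ("build", "import", "drift")}


def derive_overall_status(components) -> str:
    """fail if any component failed; pass-with-warnings if any warned."""
    if any(c.failed for c in components):
        return "fail"
    if any(c.warnings for c in components):
        return "pass-with-warnings"
    return "pass"


def completeness_note(workflow: str, components) -> Optional[str]:
    """Name the expected components that are absent, or return None."""
    present = {c.name for c in components}
    missing = [
        name for name in EXPECTED_COMPONENTS.get(workflow, ()) if name not in present
    ]
    if not missing:
        return None
    return "missing components: " + ", ".join(missing)
```

A failure takes precedence over warnings, so a run with both reports `fail`, which matches the order in which T04 lists the rules.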