chore: add WP-0007 — Interface Completeness & Evidence

Workplan covering the remaining FRS v0.2 gaps: CLI parity (inspect, test, evidence commands), style listing stub replacement, evidence assembly strengthening, LEVEL3 edge-case coverage, and a new Word-first round-trip capability (template extraction + rebuild verification).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 16:23:59 +00:00
parent 6cf973b017
commit 893b9fa57b
1 changed files with 388 additions and 0 deletions
--- a/workplans/MRKD-WP-0007-interface-completeness-evidence.md
+++ b/workplans/MRKD-WP-0007-interface-completeness-evidence.md
@@ -0,0 +1,388 @@
+---
+id: MRKD-WP-0007
+type: workplan
+domain: markitect
+repo: marki-docx
+status: active
+state_hub_workstream_id: 61701224-0813-4258-9308-025bcec41780
+created: 2026-03-17
+updated: 2026-03-17
+---
+
+# MRKD-WP-0007 — Interface Completeness & Evidence
+
+Close the remaining FRS v0.2 gaps identified after WP-0001 through WP-0006.
+The system is ~92% complete; this workplan brings it to full FRS coverage.
+
+Four clusters of functional gaps plus one new capability:
+
+1. **CLI parity** — `inspect`, `test`, and `evidence` commands exist in REST and MCP
+   but are absent from the CLI (FR-806, FR-810, FR-1409)
+2. **Style listing stub** — `GET /styles` and MCP `list_styles` return `[]`; real
+   style metadata enumeration is needed (FR-907)
+3. **Evidence assembly** — individual reports exist but the release evidence set
+   has no unified aggregation or completeness disclosure (FR-1406–1408, FR-1413)
+4. **LEVEL3 edge-case coverage** — core paths are tested; targeted tests needed for
+   diagram source mutation, processor-dependency matrix, and bibliography ambiguity
+   (FR-534, FR-538, FR-542)
+5. **Word-first round-trip** — new end-to-end capability: extract content and a
+   content-free template from an existing DOCX, then verify that MD + template → DOCX
+   reproduces the original document
+
+**Scope:** FR-806, FR-810, FR-907, FR-1409, FR-1406–1408, FR-1413,
+FR-534, FR-538, FR-542, new template-extraction capability
+**Out of scope:** new document families, non-DOCX output formats
+**Depends on:** WP-0001 through WP-0006 — all complete
+
+---
+
+## T01 — Add `markidocx inspect` and `markidocx test` CLI commands
+
+```task
+id: MRKD-WP-0007-T01
+status: todo
+priority: high
+state_hub_task_id: f77db529-b17b-4462-a704-2b9a3dbdc892
+```
+
+The underlying logic for both commands already exists and is exposed via MCP
+(`inspect_project`, `run_tests`) and REST. This task wires them into the CLI.
+
+**`markidocx inspect <manifest>`** (FR-806)
+- Calls the same project-inspection logic as `inspect_project` in MCP
+- Outputs: source files, feature level, template family, detected LEVEL3 constructs,
+  capability disclosure (which renderers/processors are available)
+- `--json` flag: machine-readable output
+- Mirrors the REST `GET /inspect` response structure
+
+**`markidocx test <manifest>`** (FR-810)
+- Runs the regression test suite for the project (same as MCP `run_tests`)
+- Outputs: pass/fail counts, skipped tests, any failures with locations
+- `--json` flag: machine-readable output
+- Exit code 0 on pass, 1 on any failure
+
+Implementation notes:
+- Add `@app.command()` entries in `cli.py`; delegate to existing logic in
+  `builder.py` / `level3.py` / workflows
+- Update `test_interface_parity.py` to assert CLI/REST/MCP parity for both commands
+- Add unit tests in `tests/test_cli_inspect_test.py`
+
+Deliverable: `markidocx inspect <manifest>` and `markidocx test <manifest>` work;
+interface parity tests pass.
+
+---
+
+## T02 — Add `markidocx evidence` CLI command
+
+```task
+id: MRKD-WP-0007-T02
+status: todo
+priority: high
+state_hub_task_id: 0af8c5bb-c01b-48cf-9895-f6c8033b0606
+```
+
+Evidence retrieval is exposed via REST (`GET /evidence/{run_id}`) and MCP
+(`get_evidence`) but has no CLI surface (FR-1409, FR-814).
+
+**`markidocx evidence <run_id>`**
+- Accepts a `run_id` (returned by `build`, `import`, `compare`, `workflow`)
+- Retrieves the full evidence record from the evidence store
+- Outputs: human-readable summary (validation result, warnings, drift counts,
+  overall `pass` / `pass-with-warnings` / `fail` status)
+- `--json` flag: full machine-readable evidence record
+- `--output <path>` flag: write evidence JSON to file
+
+**`markidocx evidence list`** (subcommand)
+- Lists run IDs available in the evidence store, newest first
+- `--limit N` (default 10)
+- `--json` flag
+
+Implementation notes:
+- Extend `cli.py` with an `evidence` group using `typer.Typer()`
+- Delegates to `evidence.py` store
+- Add `--run-id` output to existing `build`, `import`, `compare` commands so the
+  user knows what ID to retrieve (currently run_id is only in JSON output)
+- Update `test_interface_parity.py` to assert parity
+
+Deliverable: `markidocx evidence <run_id>` and `markidocx evidence list` work and
+are parity-tested against REST and MCP.
+
+---
+
+## T03 — Implement style listing (replace stub in REST and MCP)
+
+```task
+id: MRKD-WP-0007-T03
+status: todo
+priority: medium
+state_hub_task_id: e26c824c-868f-470e-bdfc-e1ae18aa7ebe
+```
+
+`GET /styles` (FR-907) and MCP `list_styles` both return `[]`. The template
+files (`.docx`) already contain named paragraph and character styles; they just
+need to be enumerated.
+
+**Style metadata model:**
+```python
+@dataclass
+class StyleEntry:
+    name: str          # e.g. "Heading 1", "Body Text"
+    style_id: str      # Word's internal ID, e.g. "Heading1"
+    type: str          # "paragraph" | "character" | "table" | "numbering"
+    family: str        # template family this style belongs to, e.g. "article"
+    built_in: bool     # True if a Word built-in style
+```
+
+**`list_styles(family: str | None) -> list[StyleEntry]`** in `templates.py`:
+- Opens the template DOCX for the given family (or default)
+- Enumerates all styles via `python-docx`'s `document.styles`
+- Returns `StyleEntry` list sorted by type then name
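+
+A minimal sketch of the enumeration step, assuming `python-docx` is installed. The
+helper name `list_styles_from_docx` and its `template_path` parameter are
+hypothetical; how a family name resolves to a template file is left to `templates.py`.
+
+```python
+from dataclasses import dataclass
+from pathlib import Path
+
+from docx import Document
+from docx.enum.style import WD_STYLE_TYPE
+
+
+@dataclass
+class StyleEntry:  # same fields as the metadata model above
+    name: str
+    style_id: str
+    type: str
+    family: str
+    built_in: bool
+
+
+# python-docx names Word's "numbering" style type LIST.
+_TYPE_NAMES = {
+    WD_STYLE_TYPE.PARAGRAPH: "paragraph",
+    WD_STYLE_TYPE.CHARACTER: "character",
+    WD_STYLE_TYPE.TABLE: "table",
+    WD_STYLE_TYPE.LIST: "numbering",
+}
+
+
+def list_styles_from_docx(template_path: Path, family: str) -> list[StyleEntry]:
+    """Enumerate every style defined in one template file (hypothetical helper)."""
+    document = Document(str(template_path))
+    entries = [
+        StyleEntry(
+            name=style.name or "",
+            style_id=style.style_id,
+            type=_TYPE_NAMES[style.type],
+            family=family,
+            built_in=style.builtin,
+        )
+        for style in document.styles
+    ]
+    # Sort by type, then name, as specified for list_styles().
+    return sorted(entries, key=lambda entry: (entry.type, entry.name))
+```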
+
+**Wire into interfaces:**
+- REST `GET /styles?family=article` → `list[StyleEntry]` as JSON
+- MCP `list_styles(family=...)` → same
+- CLI `markidocx template styles [--family article]` → tabular output (already has
+  `template_app` Typer sub-app)
+
+**Tests:**
+- `test_templates.py`: assert at least the standard heading/body styles are present
+  for each built-in family
+- Interface parity test: REST, MCP, CLI all return the same set for the same family
+
+Deliverable: `markidocx template styles`, `GET /styles`, `list_styles()` return real
+style data for all three built-in families.
+
+---
+
+## T04 — Strengthen evidence assembly — unified status summary and composition disclosure
+
+```task
+id: MRKD-WP-0007-T04
+status: todo
+priority: medium
+state_hub_task_id: d9ef5925-f70f-4e97-a2d4-6932c4c531d6
+```
+
+Individual evidence records (validation, build, import, drift) exist but there is no
+formal aggregation into a release evidence set (FR-1406–1408, FR-1413).
+
+**Release evidence set structure** (`EvidenceSet` in `evidence.py`):
+```python
+@dataclass
+class EvidenceSet:
+    run_id: str
+    created_at: str
+    manifest_path: str
+    components: list[str]          # which reports are present (FR-1407)
+    overall_status: str            # "pass" | "pass-with-warnings" | "fail" (FR-1408)
+    validation_result: ... | None
+    build_result: ... | None
+    import_result: ... | None
+    drift_result: ... | None
+    warnings: list[WarningRecord]  # aggregated across all components
+    completeness_note: str | None  # which expected components are absent (FR-1413)
+```
+
+**`assemble_evidence_set(run_id: str) -> EvidenceSet`**:
+- Reads all component records for the run from the evidence store
+- Derives `overall_status`: `fail` if any component failed; `pass-with-warnings` if
+  any warnings exist; `pass` otherwise
+- Sets `completeness_note` if expected components are absent for the workflow type
+  (e.g. a roundtrip workflow should have build + import + drift; if drift is absent,
+  note it)
+- Enumerates `components` list (FR-1407)
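+
+The status and completeness rules above could be sketched as follows. The function
+names, the `EXPECTED_COMPONENTS` table, and the assumption that each component
+reports a plain `"pass"` or `"fail"` string are illustrative, not the existing
+`evidence.py` API; only the roundtrip entry comes from this workplan.
+
+```python
+# Expected components per workflow type (hypothetical table; only the
+# roundtrip entry is taken from the workplan text).
+EXPECTED_COMPONENTS: dict[str, tuple[str, ...]] = {
+    "roundtrip": ("build", "import", "drift"),
+}
+
+
+def derive_overall_status(component_statuses: dict[str, str], warning_count: int) -> str:
+    """Any failed component wins; otherwise any warning downgrades a pass."""
+    if any(status == "fail" for status in component_statuses.values()):
+        return "fail"
+    if warning_count > 0:
+        return "pass-with-warnings"
+    return "pass"
+
+
+def completeness_note(workflow_type: str, present: list[str]) -> str | None:
+    """Name the expected components that are absent, or return None."""
+    expected = EXPECTED_COMPONENTS.get(workflow_type, ())
+    missing = [name for name in expected if name not in present]
+    if not missing:
+        return None
+    return "missing expected components: " + ", ".join(missing)
+
+
+# A roundtrip run without a drift report still passes, but discloses the gap.
+status = derive_overall_status({"build": "pass", "import": "pass"}, warning_count=0)
+note = completeness_note("roundtrip", ["build", "import"])
+print(status, "|", note)
+```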
+
+**Wire into interfaces:**
+- REST `GET /evidence/{run_id}` → return `EvidenceSet` instead of raw record
+- MCP `get_evidence(run_id)` → same
+- CLI `markidocx evidence <run_id>` → display `EvidenceSet` summary
+- Workflow commands: assemble and persist the evidence set at workflow completion
+
+**Tests:**
+- `test_evidence.py`: assert `assemble_evidence_set` returns correct `overall_status`
+  for pass / pass-with-warnings / fail scenarios
+- Assert `components` enumeration is accurate
+- Assert `completeness_note` fires when a component is absent
+
+Deliverable: All three interfaces return a coherent `EvidenceSet` with `overall_status`,
+`components`, and `completeness_note`. Existing evidence tests still pass.
+
+---
+
+## T05 — LEVEL3 edge-case coverage
+
+```task
+id: MRKD-WP-0007-T05
+status: todo
+priority: low
+state_hub_task_id: 20789d1c-4495-468f-bbb7-912e63e804e4
+```
+
+Core LEVEL3 paths are tested; this task adds targeted tests for three
+undertested edge-case areas.
+
+**FR-534 — Diagram source mutation on round-trip**
+- Test: build a DOCX with a mermaid block; manually alter the alt-text source marker
+  in the DOCX (simulate editorial mutation of the embedded diagram); import → assert
+  that `differ.py` classifies the change as `structural` (not silently dropped)
+- Test: assert that a diagram block with empty source produces a `WarningRecord`
+
+**FR-538 — Processor dependency version matrix**
+- Test: mock `shutil.which("mmdc")` to return a path; mock the `mmdc --version`
+  subprocess call to return `"10.x.x"` (supported) or `"8.x.x"` (too old)
+- Assert that an outdated renderer produces `WarningRecord(reason="renderer-version-unsupported")`
+  rather than silently falling back
+- If version-checking is not yet in `diagrams.py`, add it as part of this task
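+
+One possible shape for the version check, as a sketch only. `check_renderer_version`
+and the simplified `WarningRecord` stand-in are hypothetical, and the minimum
+supported major version is passed in by the caller because this workplan does not
+fix the threshold. Treating unparseable output as unsupported is also an assumption.
+
+```python
+import re
+from dataclasses import dataclass
+
+
+@dataclass
+class WarningRecord:  # simplified stand-in for the project's WarningRecord
+    reason: str
+    detail: str = ""
+
+
+def check_renderer_version(version_output: str, min_major: int) -> WarningRecord | None:
+    """Return a warning if the reported major version is below min_major."""
+    match = re.search(r"(\d+)\.", version_output)
+    if match is None:
+        # Assumption: output without a recognisable version counts as unsupported.
+        return WarningRecord(
+            reason="renderer-version-unsupported",
+            detail=f"unparseable version output: {version_output!r}",
+        )
+    major = int(match.group(1))
+    if major < min_major:
+        return WarningRecord(
+            reason="renderer-version-unsupported",
+            detail=f"found major version {major}, need at least {min_major}",
+        )
+    return None
+```
+
+A test can then feed the mocked `"10.x.x"` and `"8.x.x"` strings straight into the
+helper without spawning a process.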
+
+**FR-542 — Bibliography ambiguity edge cases**
+- Test: document with two citations sharing the same key → assert `WarningRecord`
+- Test: document with a citation key that has no corresponding reference entry →
+  assert `WarningRecord(reason="citation-key-missing")`
+- Test: round-trip of a references section with special characters in author names
+
+**Tests location:** extend `tests/test_level3_diagrams.py`, `tests/test_level3_bibliography.py`
+
+Deliverable: All three edge-case areas have at least two targeted tests each.
+Existing LEVEL3 tests still pass.
+
+---
+
+## T06 — End-to-end Word-first round-trip: template extraction and rebuild verification
+
+```task
+id: MRKD-WP-0007-T06
+status: todo
+priority: high
+state_hub_task_id: 0c16c598-bd49-4721-89a3-e989e1d36879
+```
+
+This task delivers a new capability: given an existing Word document as the starting
+point, marki-docx can decompose it into a Markdown content file and a content-free
+DOCX template, and then verify that recombining the two recreates the original document.
+
+This closes the loop on the round-trip: the existing flow is MD → DOCX → MD; this
+adds DOCX → (MD + template) → DOCX, making Word-authored documents first-class inputs.
+
+### New command: `markidocx template extract <source.docx>`
+
+Extracts the structural and stylistic shell of `source.docx` — keeping all styles,
+page setup, headers/footers, section properties, and theme data — while removing all
+body content (paragraphs, tables, figures, etc.).
+
+```
+markidocx template extract <source.docx> \
+    [--template-out <template.docx>]   # default: <source>-template.docx
+    [--content-out <content.md>]       # default: <source>.md  (runs import)
+    [--family <name>]                  # register extracted template under this family name
+    [--json]
+```
+
+**Outputs:**
+1. `<template.docx>` — the content-free shell (styles preserved, body empty)
+2. `<content.md>` — the Markdown content extracted via the existing `import` path
+
+**Implementation in `templates.py`:**
+```python
+def extract_template(source_path: Path, template_out: Path) -> TemplateExtractionResult:
+    """
+    Open source_path with python-docx. Copy all styles, page setup,
+    headers/footers, and theme. Clear the document body (remove all
+    paragraphs and tables). Save to template_out.
+    """
+```
+
+`TemplateExtractionResult`:
+```python
+@dataclass
+class TemplateExtractionResult:
+    template_path: Path
+    styles_preserved: int      # count of styles copied
+    warnings: list[WarningRecord]
+```
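+
+The body-clearing step in the `extract_template` docstring might look like the sketch
+below, assuming `python-docx`. The helper name `clear_body` is hypothetical. It keeps
+the trailing `<w:sectPr>` element, which holds the final section's page setup and
+header/footer references, and removes every other body child.
+
+```python
+from pathlib import Path
+
+from docx import Document
+
+
+def clear_body(source_path: Path, template_out: Path) -> int:
+    """Write a copy of source_path with an empty body; return the number of removed blocks."""
+    document = Document(str(source_path))
+    body = document.element.body
+    removed = 0
+    for child in list(body):  # copy the child list, since we mutate the body while iterating
+        # Keep the final section properties so page setup survives.
+        if isinstance(child.tag, str) and child.tag.endswith("}sectPr"):
+            continue
+        body.remove(child)
+        removed += 1
+    # Styles, theme, and header/footer parts live outside the body and are saved unchanged.
+    document.save(str(template_out))
+    return removed
+```
+
+Section breaks stored inside body paragraphs are removed along with those paragraphs,
+so a multi-section source would need extra handling that this sketch does not attempt.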
+
+**Wire into CLI:**
+- `template_app` already exists in `cli.py`; add `extract` subcommand
+- After extraction, optionally run `import` on the source to produce the `.md` file
+- Print a summary: styles preserved, content extracted, paths written
+
+**Wire into REST and MCP:**
+- REST: `POST /template/extract` — multipart upload of `source.docx`; returns
+  `TemplateExtractionResult` + download URLs for template and MD
+- MCP: `extract_template(source_path: str, template_out: str, content_out: str)`
+
+### End-to-end regression test
+
+Add `tests/regression/test_word_first_roundtrip.py`:
+
+```
+Fixture: tests/regression/fixtures/word_first/source.docx
+  — A representative Word document with headings, body text, a table,
+    an image, and a footer. Committed to the repo as a binary fixture.
+
+Test: test_word_first_roundtrip
+  1. extract_template(source.docx) → template.docx + content.md
+  2. Assert template.docx has zero body paragraphs
+  3. Assert template.docx preserves every style present in source.docx
+  4. Assert content.md is non-empty and contains the expected headings
+  5. build(manifest pointing at content.md + template.docx) → rebuilt.docx
+  6. import(rebuilt.docx) → reimported.md
+  7. Assert reimported.md is structurally equivalent to content.md
+     (use differ.py; assert zero structural drift)
+
+Test: test_template_extraction_idempotent
+  1. extract_template(source.docx) → template_a.docx
+  2. extract_template(template_a.docx) → template_b.docx
+  3. Assert template_b has same style set as template_a (extraction of an
+     already-empty template is a no-op)
+```
+
+**Fixture creation:**
+- Create `tests/regression/fixtures/word_first/` directory
+- Programmatically generate `source.docx` using `python-docx` in a fixture-generator
+  script (`tests/regression/fixtures/word_first/generate.py`) — this keeps the binary
+  reproducible from source
+- Commit the generated `source.docx` as a stable binary fixture (tracked in git)
+
+### Success criteria for T06
+
+1. `markidocx template extract source.docx` produces a valid content-free template
+   and a Markdown content file
+2. The extracted template + content can be built back into a DOCX via `markidocx build`
+3. The rebuilt DOCX imports cleanly with zero structural drift against the extracted
+   content
+4. `test_word_first_roundtrip` passes in CI
+5. REST and MCP surfaces expose the new capability
+
+---
+
+## Execution order
+
+- T01, T02, T03 are independent — can be worked in any order or in parallel
+- T04 depends on T02 (the evidence CLI command exposes the assembled set)
+- T05 is independent — can be worked at any time
+- T06 is independent of T01–T05 but benefits from T04 (evidence for the rebuild step)
+
+## Updating task status
+
+```
+status: todo        →  status: in_progress   (when you start it)
+status: in_progress →  status: done          (when verified complete)
+```
+
+When every task is `done`, set the frontmatter `status: done`.
+
+## Success criteria
+
+Before marking the workplan done:
+
+1. Every task block has `status: done`
+2. Workplan frontmatter is set to `status: done`
+3. Full test suite passes (`pytest --tb=short -q`)
+4. `ruff check` and `mypy src/` clean
+5. `markidocx inspect`, `markidocx test`, `markidocx evidence`, `markidocx template extract`
+   all present and functional
+6. `GET /styles` returns real style data (not `[]`)
+7. `markidocx evidence <run_id>` returns an `EvidenceSet` with `overall_status`
+8. `test_word_first_roundtrip` passes
+9. LEVEL3 edge-case tests added and passing