chore: add WP-0007 — Interface Completeness & Evidence

Workplan covering the remaining FRS v0.2 gaps: CLI parity (inspect, test,
evidence commands), style listing stub replacement, evidence assembly
strengthening, LEVEL3 edge-case coverage, and a new Word-first round-trip
capability (template extraction + rebuild verification).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 16:23:59 +00:00
parent 6cf973b017
commit 893b9fa57b


@@ -0,0 +1,388 @@
---
id: MRKD-WP-0007
type: workplan
domain: markitect
repo: marki-docx
status: active
state_hub_workstream_id: 61701224-0813-4258-9308-025bcec41780
created: 2026-03-17
updated: 2026-03-17
---
# MRKD-WP-0007 — Interface Completeness & Evidence
Close the remaining FRS v0.2 gaps identified after WP-0001 through WP-0006.
The system is ~92% complete; this workplan brings it to full FRS coverage.
Four clusters of functional gaps plus one new capability:
1. **CLI parity** — `inspect`, `test`, and `evidence` commands exist in REST and MCP
but are absent from the CLI (FR-806, FR-810, FR-1409)
2. **Style listing stub** — `GET /styles` and MCP `list_styles` return `[]`; real
style metadata enumeration is needed (FR-907)
3. **Evidence assembly** — individual reports exist but the release evidence set
has no unified aggregation or completeness disclosure (FR-1406–1408, FR-1413)
4. **LEVEL3 edge-case coverage** — core paths are tested; targeted tests needed for
diagram source mutation, bibliography ambiguity, and processor-dependency matrix
(FR-534, FR-538, FR-542)
5. **Word-first round-trip** — new end-to-end capability: extract content and a
content-free template from an existing DOCX, then verify that MD + template → DOCX
reproduces the original document
**Scope:** FR-806, FR-810, FR-907, FR-1409, FR-1406–1408, FR-1413,
FR-534, FR-538, FR-542, new template-extraction capability
**Out of scope:** new document families, non-DOCX output formats
**Depends on:** WP-0001 through WP-0006 — all complete
---
## T01 — Add `markidocx inspect` and `markidocx test` CLI commands
```task
id: MRKD-WP-0007-T01
status: todo
priority: high
state_hub_task_id: f77db529-b17b-4462-a704-2b9a3dbdc892
```
The underlying logic for both commands already exists and is exposed via MCP
(`inspect_project`, `run_tests`) and REST. This task wires them into the CLI.
**`markidocx inspect <manifest>`** (FR-806)
- Calls the same project-inspection logic as `inspect_project` in MCP
- Outputs: source files, feature level, template family, detected LEVEL3 constructs,
capability disclosure (which renderers/processors are available)
- `--json` flag: machine-readable output
- Mirrors the REST `GET /inspect` response structure
**`markidocx test <manifest>`** (FR-810)
- Runs the regression test suite for the project (same as MCP `run_tests`)
- Outputs: pass/fail counts, skipped tests, any failures with locations
- `--json` flag: machine-readable output
- Exit code 0 on pass, 1 on any failure
Implementation notes:
- Add `@app.command()` entries in `cli.py`; delegate to existing logic in
`builder.py` / `level3.py` / workflows
- Update `test_interface_parity.py` to assert CLI/REST/MCP parity for both commands
- Add unit tests in `tests/test_cli_inspect_test.py`
Deliverable: `markidocx inspect <manifest>` and `markidocx test <manifest>` work;
interface parity tests pass.
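
The output and exit-code rules for `markidocx test` can be sketched independently of the CLI framework. This is a minimal illustration, not the project's implementation: `RunSummary`, `render_summary`, and `exit_code` are hypothetical names, the result shape returned by the real `run_tests` logic may differ, and treating skipped tests as non-failing is an assumption.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunSummary:
    # Hypothetical result shape; the real `run_tests` return type may differ.
    passed: int
    failed: int
    skipped: int
    failures: list[str] = field(default_factory=list)  # e.g. "file:line message"


def render_summary(summary: RunSummary, as_json: bool) -> str:
    """Return the text the command would print, with or without --json."""
    if as_json:
        return json.dumps(asdict(summary), sort_keys=True)
    lines = [
        f"passed={summary.passed} failed={summary.failed} skipped={summary.skipped}"
    ]
    lines += [f"  FAIL {location}" for location in summary.failures]
    return "\n".join(lines)


def exit_code(summary: RunSummary) -> int:
    """0 when nothing failed, 1 on any failure (assumes skips do not fail the run)."""
    return 1 if summary.failed else 0
```

Keeping rendering and exit-code logic in plain functions like these would let the CLI, REST, and MCP parity tests compare the same data without going through a terminal.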
---
## T02 — Add `markidocx evidence` CLI command
```task
id: MRKD-WP-0007-T02
status: todo
priority: high
state_hub_task_id: 0af8c5bb-c01b-48cf-9895-f6c8033b0606
```
Evidence retrieval is exposed via REST (`GET /evidence/{run_id}`) and MCP
(`get_evidence`) but has no CLI surface (FR-1409, FR-814).
**`markidocx evidence <run_id>`**
- Accepts a `run_id` (returned by `build`, `import`, `compare`, `workflow`)
- Retrieves the full evidence record from the evidence store
- Outputs: human-readable summary (validation result, warnings, drift counts,
overall pass/warn/fail status)
- `--json` flag: full machine-readable evidence record
- `--output <path>` flag: write evidence JSON to file
**`markidocx evidence list`** (subcommand)
- Lists run IDs available in the evidence store, newest first
- `--limit N` (default 10)
- `--json` flag
Implementation notes:
- Extend `cli.py` with an `evidence` group using `typer.Typer()`
- Delegates to `evidence.py` store
- Add `--run-id` output to existing `build`, `import`, `compare` commands so the
user knows what ID to retrieve (currently run_id is only in JSON output)
- Update `test_interface_parity.py` to assert parity
Deliverable: `markidocx evidence <run_id>` and `markidocx evidence list` work and
are parity-tested against REST and MCP.
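
The ordering and `--limit` behaviour of `markidocx evidence list` could look roughly like the sketch below. The `RunEntry` row and `list_runs` function are hypothetical; the actual schema of the store in `evidence.py` is not shown in this workplan, and ISO 8601 timestamps are assumed.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RunEntry:
    # Hypothetical index row; the real evidence store schema may differ.
    run_id: str
    created_at: str  # ISO 8601 timestamp (assumed)


def list_runs(entries: list[RunEntry], limit: int = 10) -> list[str]:
    """Return up to `limit` run IDs, newest first."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    ordered = sorted(
        entries,
        key=lambda entry: datetime.fromisoformat(entry.created_at),
        reverse=True,
    )
    return [entry.run_id for entry in ordered[:limit]]
```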
---
## T03 — Implement style listing (replace stub in REST and MCP)
```task
id: MRKD-WP-0007-T03
status: todo
priority: medium
state_hub_task_id: e26c824c-868f-470e-bdfc-e1ae18aa7ebe
```
`GET /styles` (FR-907) and MCP `list_styles` both return `[]`. The template
files (`.docx`) already contain named paragraph and character styles; they just
need to be enumerated.
**Style metadata model:**
```python
from dataclasses import dataclass


@dataclass
class StyleEntry:
    name: str       # e.g. "Heading 1", "Body Text"
    style_id: str   # Word's internal ID, e.g. "Heading1"
    type: str       # "paragraph" | "character" | "table" | "numbering"
    family: str     # template family this style belongs to, e.g. "article"
    built_in: bool  # True if a Word built-in style
```
**`list_styles(family: str | None) -> list[StyleEntry]`** in `templates.py`:
- Opens the template DOCX for the given family (or default)
- Enumerates all styles via `python-docx`'s `document.styles`
- Returns `StyleEntry` list sorted by type then name
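
The "sorted by type then name" rule can be sketched as below, with the enumeration step left out. The helper names `sort_styles` and `to_json_rows` are hypothetical, and the case-insensitive comparison of names is an assumption rather than something the workplan specifies.

```python
from dataclasses import asdict, dataclass


@dataclass
class StyleEntry:
    name: str
    style_id: str
    type: str
    family: str
    built_in: bool


def sort_styles(entries: list[StyleEntry]) -> list[StyleEntry]:
    """Order by style type, then by display name (case-insensitive, assumed)."""
    return sorted(entries, key=lambda style: (style.type, style.name.casefold()))


def to_json_rows(entries: list[StyleEntry]) -> list[dict]:
    """Plain dicts in sorted order, suitable for a JSON response body."""
    return [asdict(style) for style in sort_styles(entries)]
```

Sharing one ordering function across REST, MCP, and CLI is one way to make the parity test a straightforward equality check.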
**Wire into interfaces:**
- REST `GET /styles?family=article` → `list[StyleEntry]` as JSON
- MCP `list_styles(family=...)` → same
- CLI `markidocx template styles [--family article]` → tabular output (`cli.py` already
has a `template_app` Typer sub-app)
**Tests:**
- `test_templates.py`: assert at least the standard heading/body styles are present
for each built-in family
- Interface parity test: REST, MCP, CLI all return the same set for the same family
Deliverable: `markidocx template styles`, `GET /styles`, `list_styles()` return real
style data for all three built-in families.
---
## T04 — Strengthen evidence assembly — unified status summary and composition disclosure
```task
id: MRKD-WP-0007-T04
status: todo
priority: medium
state_hub_task_id: d9ef5925-f70f-4e97-a2d4-6932c4c531d6
```
Individual evidence records (validation, build, import, drift) exist but there is no
formal aggregation into a release evidence set (FR-1406–1408, FR-1413).
**Release evidence set structure** (`EvidenceSet` in `evidence.py`):
```python
from __future__ import annotations  # keeps the `...` placeholders unevaluated

from dataclasses import dataclass


@dataclass
class EvidenceSet:
    run_id: str
    created_at: str
    manifest_path: str
    components: list[str]          # which reports are present (FR-1407)
    overall_status: str            # "pass" | "pass-with-warnings" | "fail" (FR-1408)
    validation_result: ... | None  # `...` is a placeholder for the component result type
    build_result: ... | None
    import_result: ... | None
    drift_result: ... | None
    warnings: list[WarningRecord]  # aggregated across all components
    completeness_note: str | None  # which expected components are absent (FR-1413)
```
**`assemble_evidence_set(run_id: str) -> EvidenceSet`**:
- Reads all component records for the run from the evidence store
- Derives `overall_status`: `fail` if any component failed; `pass-with-warnings` if
any warnings exist; `pass` otherwise
- Sets `completeness_note` if expected components are absent for the workflow type
(e.g. a roundtrip workflow should have build + import + drift; if drift is absent,
note it)
- Enumerates `components` list (FR-1407)
**Wire into interfaces:**
- REST `GET /evidence/{run_id}` → return `EvidenceSet` instead of raw record
- MCP `get_evidence(run_id)` → same
- CLI `markidocx evidence <run_id>` → display `EvidenceSet` summary
- Workflow commands: assemble and persist the evidence set at workflow completion
**Tests:**
- `test_evidence.py`: assert `assemble_evidence_set` returns correct `overall_status`
for pass / pass-with-warnings / fail scenarios
- Assert `components` enumeration is accurate
- Assert `completeness_note` fires when a component is absent
Deliverable: All three interfaces return a coherent `EvidenceSet` with `overall_status`,
`components`, and `completeness_note`. Existing evidence tests still pass.
---
## T05 — LEVEL3 edge-case coverage
```task
id: MRKD-WP-0007-T05
status: todo
priority: low
state_hub_task_id: 20789d1c-4495-468f-bbb7-912e63e804e4
```
Core LEVEL3 paths are tested; this task adds targeted tests for three
undertested edge-case areas.
**FR-534 — Diagram source mutation on round-trip**
- Test: build a DOCX with a mermaid block; manually alter the alt-text source marker
in the DOCX (simulate editorial mutation of the embedded diagram); import → assert
that `differ.py` classifies the change as `structural` (not silently dropped)
- Test: assert that a diagram block with empty source produces a `WarningRecord`
**FR-538 — Processor dependency version matrix**
- Test: mock `shutil.which("mmdc")` to return a path; mock the subprocess call so that
`mmdc --version` returns `"10.x.x"` (supported) vs. `"8.x.x"` (too old)
- Assert that an outdated renderer produces `WarningRecord(reason="renderer-version-unsupported")`
rather than silently falling back
- If version-check logic is not yet present in the renderer backends in `diagrams.py`,
add it as part of this task
**FR-542 — Bibliography ambiguity edge cases**
- Test: document with two citations sharing the same key → assert `WarningRecord`
- Test: document with a citation key that has no corresponding reference entry →
assert `WarningRecord(reason="citation-key-missing")`
- Test: round-trip of a references section with special characters in author names
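
The first two bibliography checks can be sketched as one pass over the keys. This reads "two citations sharing the same key" as two reference entries defining the same key, which is an interpretation; `citation_warnings` is a hypothetical helper, and the reason string `citation-key-duplicate` is invented for illustration (only `citation-key-missing` appears in the text).

```python
from collections import Counter


def citation_warnings(cited: list[str], reference_keys: list[str]) -> list[tuple[str, str]]:
    """Return (reason, key) pairs, sorted for stable output."""
    counts = Counter(reference_keys)
    # "citation-key-duplicate" is a hypothetical reason string.
    warnings = [("citation-key-duplicate", key) for key, n in counts.items() if n > 1]
    # A cited key with no reference entry at all.
    warnings += [("citation-key-missing", key) for key in set(cited) if key not in counts]
    return sorted(warnings)
```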
**Tests location:** extend `tests/test_level3_diagrams.py`, `tests/test_level3_bibliography.py`
Deliverable: All three edge-case areas have at least two targeted tests each.
Existing LEVEL3 tests still pass.
---
## T06 — End-to-end Word-first round-trip: template extraction and rebuild verification
```task
id: MRKD-WP-0007-T06
status: todo
priority: high
state_hub_task_id: 0c16c598-bd49-4721-89a3-e989e1d36879
```
This task delivers a new capability: given an existing Word document as the starting
point, marki-docx can decompose it into a Markdown content file and a content-free
DOCX template, and then verify that recombining the two recreates the original document.
This closes the loop on the round-trip: the existing flow is MD → DOCX → MD; this
adds DOCX → (MD + template) → DOCX, making Word-authored documents first-class inputs.
### New command: `markidocx template extract <source.docx>`
Extracts the structural and stylistic shell of `source.docx` — keeping all styles,
page setup, headers/footers, section properties, and theme data — while removing all
body content (paragraphs, tables, figures, etc.).
```
markidocx template extract <source.docx> \
    [--template-out <template.docx>]   # default: <source>-template.docx
    [--content-out <content.md>]       # default: <source>.md (runs import)
    [--family <name>]                  # register extracted template under this family name
    [--json]
```
**Outputs:**
1. `<template.docx>` — the content-free shell (styles preserved, body empty)
2. `<content.md>` — the Markdown content extracted via the existing `import` path
**Implementation in `templates.py`:**
```python
from __future__ import annotations

from pathlib import Path


def extract_template(source_path: Path, template_out: Path) -> TemplateExtractionResult:
    """
    Open source_path with python-docx. Copy all styles, page setup,
    headers/footers, and theme. Clear the document body (remove all
    paragraphs and tables). Save to template_out.
    """
```
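
The "clear the document body" step can be illustrated on the raw WordprocessingML, where body content lives under `w:body` and the final `w:sectPr` child carries the page setup. The sketch below is a standard-library stand-in operating on the `word/document.xml` text, not the python-docx approach the docstring describes; `clear_body` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the conventional "w:" prefix on output


def clear_body(document_xml: str) -> str:
    """Remove every child of w:body except the body-level w:sectPr."""
    root = ET.fromstring(document_xml)
    body = root.find(f"{{{W_NS}}}body")
    if body is None:
        raise ValueError("no w:body element found")
    for child in list(body):  # copy the list: we mutate body while iterating
        if child.tag != f"{{{W_NS}}}sectPr":
            body.remove(child)
    return ET.tostring(root, encoding="unicode")
```

One caveat with this approach: section properties for earlier sections are stored inside paragraphs, so removing all paragraphs keeps only the final section's page setup. Whether that is acceptable for multi-section documents is a decision for the real implementation.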
`TemplateExtractionResult`:
```python
from __future__ import annotations
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TemplateExtractionResult:
    template_path: Path
    styles_preserved: int  # count of styles copied
    warnings: list[WarningRecord]
```
**Wire into CLI:**
- `template_app` already exists in `cli.py`; add `extract` subcommand
- After extraction, optionally run `import` on the source to produce the `.md` file
- Print a summary: styles preserved, content extracted, paths written
**Wire into REST and MCP:**
- REST: `POST /template/extract` — multipart upload of `source.docx`; returns
`TemplateExtractionResult` + download URLs for template and MD
- MCP: `extract_template(source_path: str, template_out: str, content_out: str)`
### End-to-end regression test
Add `tests/regression/test_word_first_roundtrip.py`:
```
Fixture: tests/regression/fixtures/word_first/source.docx
    A representative Word document with headings, body text, a table,
    an image, and a footer. Committed to the repo as a binary fixture.

Test: test_word_first_roundtrip
    1. extract_template(source.docx) → template.docx + content.md
    2. Assert template.docx has zero body paragraphs
    3. Assert template.docx preserves at least the styles present in source.docx
    4. Assert content.md is non-empty and contains the expected headings
    5. build(manifest pointing at content.md + template.docx) → rebuilt.docx
    6. import(rebuilt.docx) → reimported.md
    7. Assert reimported.md is structurally equivalent to content.md
       (use differ.py; assert zero structural drift)

Test: test_template_extraction_idempotent
    1. extract_template(source.docx) → template_a.docx
    2. extract_template(template_a.docx) → template_b.docx
    3. Assert template_b has same style set as template_a (extraction of an
       already-empty template is a no-op)
```
**Fixture creation:**
- Create `tests/regression/fixtures/word_first/` directory
- Programmatically generate `source.docx` using `python-docx` in a fixture-generator
script (`tests/regression/fixtures/word_first/generate.py`) — this keeps the binary
reproducible from source
- Commit the generated `source.docx` as a stable binary fixture (tracked in git)
### Success criteria for T06
1. `markidocx template extract source.docx` produces a valid content-free template
and a Markdown content file
2. The extracted template + content can be built back into a DOCX via `markidocx build`
3. The rebuilt DOCX imports cleanly with zero structural drift against the extracted
content
4. `test_word_first_roundtrip` passes in CI
5. REST and MCP surfaces expose the new capability
---
## Execution order
- T01, T02, T03 are independent — can be worked in any order or in parallel
- T04 depends on T02 (the evidence CLI command exposes the assembled set)
- T05 is independent — can be worked at any time
- T06 is independent of T01–T05 but benefits from T04 (evidence for the rebuild step)
## Updating task status
```
status: todo → status: in_progress (when you start it)
status: in_progress → status: done (when verified complete)
```
When every task is `done`, set the frontmatter `status: done`.
## Success criteria
Before marking the workplan done:
1. Every task block has `status: done`
2. Workplan frontmatter `status: done`
3. Full test suite passes (`pytest --tb=short -q`)
4. `ruff check` and `mypy src/` clean
5. `markidocx inspect`, `markidocx test`, `markidocx evidence`, `markidocx template extract`
all present and functional
6. `GET /styles` returns real style data (not `[]`)
7. `markidocx evidence <run_id>` returns an `EvidenceSet` with `overall_status`
8. `test_word_first_roundtrip` passes
9. LEVEL3 edge-case tests added and passing