
---
id: MRKD-WP-0007
type: workplan
domain: communication
repo: marki-docx
status: done
state_hub_workstream_id: 61701224-0813-4258-9308-025bcec41780
created: 2026-03-17
updated: 2026-03-17
completed: 2026-03-17
---
# MRKD-WP-0007 — Interface Completeness & Evidence
Close the remaining FRS v0.2 gaps identified after WP-0001 through WP-0006.
The system is ~92% complete; this workplan brings it to full FRS coverage.
Three clusters of functional gaps, one test-coverage gap, and one new capability:
1. **CLI parity**: `inspect`, `test`, and `evidence` commands exist in REST and MCP
but are absent from the CLI (FR-806, FR-810, FR-1409)
2. **Style listing stub**: `GET /styles` and MCP `list_styles` return `[]`; real
style metadata enumeration is needed (FR-907)
3. **Evidence assembly**: individual reports exist but the release evidence set
has no unified aggregation or completeness disclosure (FR-1406 to FR-1408, FR-1413)
4. **LEVEL3 edge-case coverage**: core paths are tested; targeted tests are needed for
diagram source mutation, bibliography ambiguity, and the processor-dependency matrix
(FR-534, FR-538, FR-542)
5. **Word-first round-trip**: a new end-to-end capability that extracts content and a
content-free template from an existing DOCX, then verifies that MD + template → DOCX
reproduces the original document
**Scope:** FR-806, FR-810, FR-907, FR-1409, FR-1406 to FR-1408, FR-1413,
FR-534, FR-538, FR-542, new template-extraction capability
**Out of scope:** new document families, non-DOCX output formats
**Depends on:** WP-0001 through WP-0006 — all complete
---
## T01 — Add `markidocx inspect` and `markidocx test` CLI commands
```task
id: MRKD-WP-0007-T01
status: done
priority: high
state_hub_task_id: f77db529-b17b-4462-a704-2b9a3dbdc892
```
The underlying logic for both commands already exists and is exposed via MCP
(`inspect_project`, `run_tests`) and REST. This task wires them into the CLI.
**`markidocx inspect <manifest>`** (FR-806)
- Calls the same project-inspection logic as `inspect_project` in MCP
- Outputs: source files, feature level, template family, detected LEVEL3 constructs,
capability disclosure (which renderers/processors are available)
- `--json` flag: machine-readable output
- Mirrors the REST `GET /inspect` response structure
**`markidocx test <manifest>`** (FR-810)
- Runs the regression test suite for the project (same as MCP `run_tests`)
- Outputs: pass/fail counts, skipped tests, any failures with locations
- `--json` flag: machine-readable output
- Exit code 0 on pass, 1 on any failure
Implementation notes:
- Add `@app.command()` entries in `cli.py`; delegate to existing logic in
`builder.py` / `level3.py` / workflows
- Update `test_interface_parity.py` to assert CLI/REST/MCP parity for both commands
- Add unit tests in `tests/test_cli_inspect_test.py`
Deliverable: `markidocx inspect <manifest>` and `markidocx test <manifest>` work;
interface parity tests pass.
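
A minimal sketch of the output contract for `markidocx test`, assuming a hypothetical `RunSummary` result type (the real return type of `run_tests` is not shown in this workplan). It illustrates only the `--json` switch and the 0/1 exit-code rule; the Typer wiring in `cli.py` is left out.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RunSummary:
    """Hypothetical result shape; the real `run_tests` return type may differ."""
    passed: int
    failed: int
    skipped: int
    failures: list[str] = field(default_factory=list)  # e.g. "file:line message"


def render_test_summary(summary: RunSummary, as_json: bool) -> tuple[str, int]:
    """Return (text to print, process exit code)."""
    # Exit code 0 on pass, 1 on any failure, as specified for FR-810.
    exit_code = 0 if summary.failed == 0 else 1
    if as_json:
        return json.dumps(asdict(summary), sort_keys=True), exit_code
    lines = [
        f"passed={summary.passed} failed={summary.failed} skipped={summary.skipped}"
    ]
    lines += [f"  FAIL {location}" for location in summary.failures]
    return "\n".join(lines), exit_code
```

Returning the exit code alongside the text keeps the function free of side effects, so the same helper could back the parity tests without spawning a subprocess.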
---
## T02 — Add `markidocx evidence` CLI command
```task
id: MRKD-WP-0007-T02
status: done
priority: high
state_hub_task_id: 0af8c5bb-c01b-48cf-9895-f6c8033b0606
```
Evidence retrieval is exposed via REST (`GET /evidence/{run_id}`) and MCP
(`get_evidence`) but has no CLI surface (FR-1409, FR-814).
**`markidocx evidence <run_id>`**
- Accepts a `run_id` (returned by `build`, `import`, `compare`, `workflow`)
- Retrieves the full evidence record from the evidence store
- Outputs: human-readable summary (validation result, warnings, drift counts,
overall pass/warn/fail status)
- `--json` flag: full machine-readable evidence record
- `--output <path>` flag: write evidence JSON to file
**`markidocx evidence list`** (subcommand)
- Lists run IDs available in the evidence store, newest first
- `--limit N` (default 10)
- `--json` flag
Implementation notes:
- Extend `cli.py` with an `evidence` group using `typer.Typer()`
- Delegates to `evidence.py` store
- Add `--run-id` output to existing `build`, `import`, `compare` commands so the
user knows what ID to retrieve (currently run_id is only in JSON output)
- Update `test_interface_parity.py` to assert parity
Deliverable: `markidocx evidence <run_id>` and `markidocx evidence list` work and
are parity-tested against REST and MCP.
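
The `evidence list` ordering and `--limit` behaviour could look roughly like this. `RunRecord` and `list_run_ids` are hypothetical names; the actual schema of the `evidence.py` store is not described here, and the sketch assumes timestamps are stored as ISO 8601 UTC strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical store entry; the real evidence store schema may differ."""
    run_id: str
    created_at: str  # ISO 8601 UTC timestamp, so string order matches time order


def list_run_ids(records: list[RunRecord], limit: int = 10) -> list[str]:
    """Return up to `limit` run IDs, newest first."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    newest_first = sorted(records, key=lambda r: r.created_at, reverse=True)
    return [record.run_id for record in newest_first[:limit]]
```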
---
## T03 — Implement style listing (replace stub in REST and MCP)
```task
id: MRKD-WP-0007-T03
status: done
priority: medium
state_hub_task_id: e26c824c-868f-470e-bdfc-e1ae18aa7ebe
```
`GET /styles` (FR-907) and MCP `list_styles` both return `[]`. The template
files (`.docx`) already contain named paragraph and character styles; they just
need to be enumerated.
**Style metadata model:**
```python
from dataclasses import dataclass


@dataclass
class StyleEntry:
    name: str       # e.g. "Heading 1", "Body Text"
    style_id: str   # Word's internal ID, e.g. "Heading1"
    type: str       # "paragraph" | "character" | "table" | "numbering"
    family: str     # template family this style belongs to, e.g. "article"
    built_in: bool  # True if a Word built-in style
```
**`list_styles(family: str | None) -> list[StyleEntry]`** in `templates.py`:
- Opens the template DOCX for the given family (or default)
- Enumerates all styles via `python-docx`'s `document.styles`
- Returns `StyleEntry` list sorted by type then name
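
The "sorted by type then name" rule and the JSON shape can be sketched without opening a template. `sort_styles` and `styles_to_json` are illustrative helper names, not existing functions in `templates.py`; enumeration through `python-docx` is deliberately omitted.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StyleEntry:
    name: str
    style_id: str
    type: str
    family: str
    built_in: bool


def sort_styles(entries: list[StyleEntry]) -> list[StyleEntry]:
    """Order by style type first, then by display name."""
    return sorted(entries, key=lambda entry: (entry.type, entry.name))


def styles_to_json(entries: list[StyleEntry]) -> str:
    """Serialise the sorted list the way a JSON endpoint might return it."""
    return json.dumps([asdict(entry) for entry in sort_styles(entries)])
```

Sorting in one place before serialisation means all three interfaces would see the same order, which simplifies the parity test.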
**Wire into interfaces:**
- REST `GET /styles?family=article` → `list[StyleEntry]` as JSON
- MCP `list_styles(family=...)` → same
- CLI `markidocx template styles [--family article]` → tabular output (already has
`template_app` Typer sub-app)
**Tests:**
- `test_templates.py`: assert at least the standard heading/body styles are present
for each built-in family
- Interface parity test: REST, MCP, CLI all return the same set for the same family
Deliverable: `markidocx template styles`, `GET /styles`, `list_styles()` return real
style data for all three built-in families.
---
## T04 — Strengthen evidence assembly — unified status summary and composition disclosure
```task
id: MRKD-WP-0007-T04
status: done
priority: medium
state_hub_task_id: d9ef5925-f70f-4e97-a2d4-6932c4c531d6
```
Individual evidence records (validation, build, import, drift) exist but there is no
formal aggregation into a release evidence set (FR-1406 to FR-1408, FR-1413).
**Release evidence set structure** (`EvidenceSet` in `evidence.py`):
```python
from __future__ import annotations  # keeps the placeholder annotations below unevaluated

from dataclasses import dataclass


@dataclass
class EvidenceSet:
    run_id: str
    created_at: str
    manifest_path: str
    components: list[str]            # which reports are present (FR-1407)
    overall_status: str              # "pass" | "pass-with-warnings" | "fail" (FR-1408)
    # `...` stands for each component's report type, which this workplan leaves open
    validation_result: ... | None
    build_result: ... | None
    import_result: ... | None
    drift_result: ... | None
    warnings: list[WarningRecord]    # aggregated across all components
    completeness_note: str | None    # which expected components are absent (FR-1413)
```
**`assemble_evidence_set(run_id: str) -> EvidenceSet`**:
- Reads all component records for the run from the evidence store
- Derives `overall_status`: `fail` if any component failed; `pass-with-warnings` if
any warnings exist; `pass` otherwise
- Sets `completeness_note` if expected components are absent for the workflow type
(e.g. a roundtrip workflow should have build + import + drift; if drift is absent,
note it)
- Enumerates `components` list (FR-1407)
**Wire into interfaces:**
- REST `GET /evidence/{run_id}` → return `EvidenceSet` instead of raw record
- MCP `get_evidence(run_id)` → same
- CLI `markidocx evidence <run_id>` → display `EvidenceSet` summary
- Workflow commands: assemble and persist the evidence set at workflow completion
**Tests:**
- `test_evidence.py`: assert `assemble_evidence_set` returns correct `overall_status`
for pass / pass-with-warnings / fail scenarios
- Assert `components` enumeration is accurate
- Assert `completeness_note` fires when a component is absent
Deliverable: All three interfaces return a coherent `EvidenceSet` with `overall_status`,
`components`, and `completeness_note`. Existing evidence tests still pass.
---
## T05 — LEVEL3 edge-case coverage
```task
id: MRKD-WP-0007-T05
status: done
priority: low
state_hub_task_id: 20789d1c-4495-468f-bbb7-912e63e804e4
```
Core LEVEL3 paths are tested; this task adds targeted tests for three
undertested edge-case areas.
**FR-534 — Diagram source mutation on round-trip**
- Test: build a DOCX with a mermaid block; manually alter the alt-text source marker
in the DOCX (simulate editorial mutation of the embedded diagram); import → assert
that `differ.py` classifies the change as `structural` (not silently dropped)
- Test: assert that a diagram block with empty source produces a `WarningRecord`
**FR-538 — Processor dependency version matrix**
- Test: mock `shutil.which("mmdc")` to return a path; mock the subprocess call so that
`mmdc --version` → `"10.x.x"` (supported) vs. `"8.x.x"` (too old)
- Assert that an outdated renderer produces `WarningRecord(reason="renderer-version-unsupported")`
rather than silently falling back (requires adding version-check logic to renderer
backends in `diagrams.py` if not already present)
- If version-checking is not yet in `diagrams.py`, add it as part of this task
**FR-542 — Bibliography ambiguity edge cases**
- Test: document with two citations sharing the same key → assert `WarningRecord`
- Test: document with a citation key that has no corresponding reference entry →
assert `WarningRecord(reason="citation-key-missing")`
- Test: round-trip of a references section with special characters in author names
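
The two key checks might reduce to helpers like these. The sketch assumes Pandoc-style `[@key]` citations, which this workplan does not confirm, and reads "two citations sharing the same key" as two reference entries defined under one key; both function names are hypothetical.

```python
import re
from collections import Counter

# Assumed citation syntax: Pandoc-style [@key]. The project's real syntax may differ.
CITATION_RE = re.compile(r"\[@([A-Za-z0-9_:-]+)\]")


def find_missing_keys(markdown: str, reference_keys: list[str]) -> list[str]:
    """Citation keys used in the text that have no reference entry."""
    known = set(reference_keys)
    cited = CITATION_RE.findall(markdown)
    # dict.fromkeys keeps first-seen order while dropping repeated citations.
    return [key for key in dict.fromkeys(cited) if key not in known]


def find_duplicate_keys(reference_keys: list[str]) -> list[str]:
    """Reference keys defined more than once, i.e. ambiguous citation targets."""
    counts = Counter(reference_keys)
    return [key for key, count in counts.items() if count > 1]
```

Each returned key would then map to one `WarningRecord`, for example `reason="citation-key-missing"` for the first helper.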
**Tests location:** extend `tests/test_level3_diagrams.py`, `tests/test_level3_bibliography.py`
Deliverable: All three edge-case areas have at least two targeted tests each.
Existing LEVEL3 tests still pass.
---
## T06 — End-to-end Word-first round-trip: template extraction and rebuild verification
```task
id: MRKD-WP-0007-T06
status: done
priority: high
state_hub_task_id: 0c16c598-bd49-4721-89a3-e989e1d36879
```
This task delivers a new capability: given an existing Word document as the starting
point, marki-docx can decompose it into a Markdown content file and a content-free
DOCX template, and then verify that recombining the two recreates the original document.
This closes the loop on the round-trip: the existing flow is MD → DOCX → MD; this
adds DOCX → (MD + template) → DOCX, making Word-authored documents first-class inputs.
### New command: `markidocx template extract <source.docx>`
Extracts the structural and stylistic shell of `source.docx` — keeping all styles,
page setup, headers/footers, section properties, and theme data — while removing all
body content (paragraphs, tables, figures, etc.).
```
markidocx template extract <source.docx> \
    [--template-out <template.docx>]   # default: <source>-template.docx
    [--content-out <content.md>]       # default: <source>.md (runs import)
    [--family <name>]                  # register extracted template under this family name
    [--json]
```
**Outputs:**
1. `<template.docx>` — the content-free shell (styles preserved, body empty)
2. `<content.md>` — the Markdown content extracted via the existing `import` path
**Implementation in `templates.py`:**
```python
from __future__ import annotations  # TemplateExtractionResult is defined further down

from pathlib import Path


def extract_template(source_path: Path, template_out: Path) -> TemplateExtractionResult:
    """
    Open source_path with python-docx. Copy all styles, page setup,
    headers/footers, and theme. Clear the document body (remove all
    paragraphs and tables). Save to template_out.
    """
```
`TemplateExtractionResult`:
```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TemplateExtractionResult:
    template_path: Path
    styles_preserved: int  # count of styles copied
    warnings: list[WarningRecord]
```
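
The "clear the document body" step can be illustrated on the raw `word/document.xml` part using only the standard library. This is a sketch of the idea, not the planned `python-docx` implementation: `clear_body` is a hypothetical helper, and it assumes the page setup to keep lives in the `<w:sectPr>` that is a direct child of `<w:body>`.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the conventional "w:" prefix on output


def clear_body(document_xml: bytes) -> bytes:
    """Remove every child of <w:body> except a direct-child <w:sectPr>."""
    root = ET.fromstring(document_xml)
    body = root.find(f"{{{W_NS}}}body")
    if body is None:
        raise ValueError("no w:body element found")
    for child in list(body):  # iterate over a copy because body is mutated
        if child.tag != f"{{{W_NS}}}sectPr":
            body.remove(child)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

The body-level `<w:sectPr>` is kept because it carries the final section's page size, margins, and header/footer references. ElementTree may rewrite prefixes for namespaces other than `w`, which is one reason a real implementation would more likely go through `python-docx` as described above.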
**Wire into CLI:**
- `template_app` already exists in `cli.py`; add `extract` subcommand
- After extraction, optionally run `import` on the source to produce the `.md` file
- Print a summary: styles preserved, content extracted, paths written
**Wire into REST and MCP:**
- REST: `POST /template/extract` — multipart upload of `source.docx`; returns
`TemplateExtractionResult` + download URLs for template and MD
- MCP: `extract_template(source_path: str, template_out: str, content_out: str)`
### End-to-end regression test
Add `tests/regression/test_word_first_roundtrip.py`:
```
Fixture: tests/regression/fixtures/word_first/source.docx
  A representative Word document with headings, body text, a table,
  an image, and a footer. Committed to the repo as a binary fixture.

Test: test_word_first_roundtrip
  1. extract_template(source.docx) → template.docx + content.md
  2. Assert template.docx has zero body paragraphs
  3. Assert template.docx preserves at least the styles present in source.docx
  4. Assert content.md is non-empty and contains the expected headings
  5. build(manifest pointing at content.md + template.docx) → rebuilt.docx
  6. import(rebuilt.docx) → reimported.md
  7. Assert reimported.md is structurally equivalent to content.md
     (use differ.py; assert zero structural drift)

Test: test_template_extraction_idempotent
  1. extract_template(source.docx) → template_a.docx
  2. extract_template(template_a.docx) → template_b.docx
  3. Assert template_b has same style set as template_a (extraction of an
     already-empty template is a no-op)
```
**Fixture creation:**
- Create `tests/regression/fixtures/word_first/` directory
- Programmatically generate `source.docx` using `python-docx` in a fixture-generator
script (`tests/regression/fixtures/word_first/generate.py`) — this keeps the binary
reproducible from source
- Commit the generated `source.docx` as a stable binary fixture (tracked in git)
### Success criteria for T06
1. `markidocx template extract source.docx` produces a valid content-free template
and a Markdown content file
2. The extracted template + content can be built back into a DOCX via `markidocx build`
3. The rebuilt DOCX imports cleanly with zero structural drift against the extracted
content
4. `test_word_first_roundtrip` passes in CI
5. REST and MCP surfaces expose the new capability
---
## Execution order
- T01, T02, T03 are independent — can be worked in any order or in parallel
- T04 depends on T02 (the evidence CLI command exposes the assembled set)
- T05 is independent — can be worked at any time
- T06 is independent of T01 through T05 but benefits from T04 (evidence for the rebuild step)
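
For tooling that schedules these tasks, the ordering constraints can be expressed as a dependency graph. This sketch encodes only the one hard dependency (T04 on T02); the "benefits from" relation for T06 is treated as non-blocking, and `execution_order` is an illustrative helper.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks it depends on.
# Only the hard dependency from the list above is encoded.
DEPENDENCIES: dict[str, set[str]] = {
    "T01": set(),
    "T02": set(),
    "T03": set(),
    "T04": {"T02"},
    "T05": set(),
    "T06": set(),
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every task follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())
```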
## Updating task status
```
status: todo → status: in_progress (when you start it)
status: in_progress → status: done (when verified complete)
```
When every task is `done`, set the frontmatter `status: done`.
## Success criteria
Before marking the workplan done:
1. Every task block has `status: done`
2. Workplan frontmatter `status: done`
3. Full test suite passes (`pytest --tb=short -q`)
4. `ruff check` and `mypy src/` clean
5. `markidocx inspect`, `markidocx test`, `markidocx evidence`, `markidocx template extract`
all present and functional
6. `GET /styles` returns real style data (not `[]`)
7. `markidocx evidence <run_id>` returns an `EvidenceSet` with `overall_status`
8. `test_word_first_roundtrip` passes
9. LEVEL3 edge-case tests added and passing