generated from coulomb/repo-seed
chore: add WP-0007 — Interface Completeness & Evidence
Workplan covering the remaining FRS v0.2 gaps: CLI parity (inspect, test, evidence commands), style listing stub replacement, evidence assembly strengthening, LEVEL3 edge-case coverage, and a new Word-first round-trip capability (template extraction + rebuild verification).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
388
workplans/MRKD-WP-0007-interface-completeness-evidence.md
Normal file
@@ -0,0 +1,388 @@
---
id: MRKD-WP-0007
type: workplan
domain: markitect
repo: marki-docx
status: active
state_hub_workstream_id: 61701224-0813-4258-9308-025bcec41780
created: 2026-03-17
updated: 2026-03-17
---

# MRKD-WP-0007 — Interface Completeness & Evidence

Close the remaining FRS v0.2 gaps identified after WP-0001 through WP-0006.
The system is ~92% complete; this workplan brings it to full FRS coverage.

Four clusters of gaps plus one new capability:

1. **CLI parity** — `inspect`, `test`, and `evidence` commands exist in REST and MCP
   but are absent from the CLI (FR-806, FR-810, FR-1409)
2. **Style listing stub** — `GET /styles` and MCP `list_styles` return `[]`; real
   style metadata enumeration is needed (FR-907)
3. **Evidence assembly** — individual reports exist but the release evidence set
   has no unified aggregation or completeness disclosure (FR-1406–1408, FR-1413)
4. **LEVEL3 edge-case coverage** — core paths are tested; targeted tests needed for
   diagram source mutation, bibliography ambiguity, and processor-dependency matrix
   (FR-534, FR-538, FR-542)
5. **Word-first round-trip** — new end-to-end capability: extract content and a
   content-free template from an existing DOCX, then verify that MD + template → DOCX
   reproduces the original document

**Scope:** FR-806, FR-810, FR-907, FR-1409, FR-1406–1408, FR-1413,
FR-534, FR-538, FR-542, new template-extraction capability
**Out of scope:** new document families, non-DOCX output formats
**Depends on:** WP-0001 through WP-0006 — all complete

---

## T01 — Add `markidocx inspect` and `markidocx test` CLI commands

```task
id: MRKD-WP-0007-T01
status: todo
priority: high
state_hub_task_id: f77db529-b17b-4462-a704-2b9a3dbdc892
```

The underlying logic for both commands already exists and is exposed via MCP
(`inspect_project`, `run_tests`) and REST. This task wires them into the CLI.

**`markidocx inspect <manifest>`** (FR-806)
- Calls the same project-inspection logic as `inspect_project` in MCP
- Outputs: source files, feature level, template family, detected LEVEL3 constructs,
  capability disclosure (which renderers/processors are available)
- `--json` flag: machine-readable output
- Mirrors the REST `GET /inspect` response structure

**`markidocx test <manifest>`** (FR-810)
- Runs the regression test suite for the project (same as MCP `run_tests`)
- Outputs: pass/fail counts, skipped tests, any failures with locations
- `--json` flag: machine-readable output
- Exit code 0 on pass, 1 on any failure

Implementation notes:
- Add `@app.command()` entries in `cli.py`; delegate to existing logic in
  `builder.py` / `level3.py` / workflows
- Update `test_interface_parity.py` to assert CLI/REST/MCP parity for both commands
- Add unit tests in `tests/test_cli_inspect_test.py`

Deliverable: `markidocx inspect <manifest>` and `markidocx test <manifest>` work;
interface parity tests pass.
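
The exit-code rule and the two output modes for `markidocx test` can be sketched independently of the Typer wiring. This is a minimal illustration under stated assumptions, not the project's implementation: `SuiteSummary`, `exit_code`, and `render` are hypothetical names, and the real return type of `run_tests` may look different. Keeping the rendering in plain functions like these would let the CLI, REST, and MCP parity tests compare the same data.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SuiteSummary:
    """Hypothetical result shape; the real run_tests output may differ."""
    passed: int
    failed: int
    skipped: int
    failures: list[str] = field(default_factory=list)  # "location: message" strings


def exit_code(summary: SuiteSummary) -> int:
    """Return 0 when nothing failed, 1 on any failure."""
    return 1 if summary.failed > 0 else 0


def render(summary: SuiteSummary, as_json: bool = False) -> str:
    """Produce either the --json payload or a short human-readable report."""
    if as_json:
        return json.dumps(asdict(summary), sort_keys=True)
    lines = [
        f"passed: {summary.passed}  failed: {summary.failed}  skipped: {summary.skipped}"
    ]
    lines += [f"  FAIL {item}" for item in summary.failures]
    return "\n".join(lines)
```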

---

## T02 — Add `markidocx evidence` CLI command

```task
id: MRKD-WP-0007-T02
status: todo
priority: high
state_hub_task_id: 0af8c5bb-c01b-48cf-9895-f6c8033b0606
```

Evidence retrieval is exposed via REST (`GET /evidence/{run_id}`) and MCP
(`get_evidence`) but has no CLI surface (FR-1409, FR-814).

**`markidocx evidence <run_id>`**
- Accepts a `run_id` (returned by `build`, `import`, `compare`, `workflow`)
- Retrieves the full evidence record from the evidence store
- Outputs: human-readable summary (validation result, warnings, drift counts,
  overall pass/warn/fail status)
- `--json` flag: full machine-readable evidence record
- `--output <path>` flag: write evidence JSON to file

**`markidocx evidence list`** (subcommand)
- Lists run IDs available in the evidence store, newest first
- `--limit N` (default 10)
- `--json` flag

Implementation notes:
- Extend `cli.py` with an `evidence` group using `typer.Typer()`
- Delegates to `evidence.py` store
- Add `--run-id` output to existing `build`, `import`, `compare` commands so the
  user knows what ID to retrieve (currently run_id is only in JSON output)
- Update `test_interface_parity.py` to assert parity

Deliverable: `markidocx evidence <run_id>` and `markidocx evidence list` work and
are parity-tested against REST and MCP.
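
The "newest first, `--limit N`" behaviour of `markidocx evidence list` might look like the sketch below. `RunRecord` and `list_runs` are hypothetical names, and the sketch assumes each stored record carries an ISO 8601 `created_at` string (all naive or all timezone-aware, since the two cannot be compared); the actual evidence store in `evidence.py` may expose something different.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical minimal view of a stored evidence record."""
    run_id: str
    created_at: str  # ISO 8601 timestamp (assumed format)


def list_runs(records: list[RunRecord], limit: int = 10) -> list[str]:
    """Return run IDs sorted newest first, truncated to `limit` entries."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    ordered = sorted(
        records,
        key=lambda record: datetime.fromisoformat(record.created_at),
        reverse=True,
    )
    return [record.run_id for record in ordered[:limit]]
```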

---

## T03 — Implement style listing (replace stub in REST and MCP)

```task
id: MRKD-WP-0007-T03
status: todo
priority: medium
state_hub_task_id: e26c824c-868f-470e-bdfc-e1ae18aa7ebe
```

`GET /styles` (FR-907) and MCP `list_styles` both return `[]`. The template
files (`.docx`) already contain named paragraph and character styles; they just
need to be enumerated.

**Style metadata model:**
```python
@dataclass
class StyleEntry:
    name: str       # e.g. "Heading 1", "Body Text"
    style_id: str   # Word's internal ID, e.g. "Heading1"
    type: str       # "paragraph" | "character" | "table" | "numbering"
    family: str     # template family this style belongs to, e.g. "article"
    built_in: bool  # True if a Word built-in style
```

**`list_styles(family: str | None) -> list[StyleEntry]`** in `templates.py`:
- Opens the template DOCX for the given family (or default)
- Enumerates all styles via `python-docx`'s `document.styles`
- Returns `StyleEntry` list sorted by type then name

**Wire into interfaces:**
- REST `GET /styles?family=article` → `list[StyleEntry]` as JSON
- MCP `list_styles(family=...)` → same
- CLI `markidocx template styles [--family article]` → tabular output (already has
  `template_app` Typer sub-app)

**Tests:**
- `test_templates.py`: assert at least the standard heading/body styles are present
  for each built-in family
- Interface parity test: REST, MCP, CLI all return the same set for the same family

Deliverable: `markidocx template styles`, `GET /styles`, `list_styles()` return real
style data for all three built-in families.
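
The "sorted by type then name" rule and the JSON shape shared by the three interfaces can be sketched without touching a template file. This is only an illustration: `StyleEntry` is repeated from the model above so the block is self-contained, while `sort_styles` and `styles_to_json` are hypothetical helper names; reading the styles from the DOCX via `python-docx` is deliberately left out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StyleEntry:
    name: str
    style_id: str
    type: str
    family: str
    built_in: bool


def sort_styles(entries: list[StyleEntry]) -> list[StyleEntry]:
    """Order entries by style type first, then by display name."""
    return sorted(entries, key=lambda entry: (entry.type, entry.name))


def styles_to_json(entries: list[StyleEntry]) -> str:
    """Serialise the sorted entries; one payload for REST, MCP, and CLI --json."""
    return json.dumps([asdict(entry) for entry in sort_styles(entries)])
```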

---

## T04 — Strengthen evidence assembly — unified status summary and composition disclosure

```task
id: MRKD-WP-0007-T04
status: todo
priority: medium
state_hub_task_id: d9ef5925-f70f-4e97-a2d4-6932c4c531d6
```

Individual evidence records (validation, build, import, drift) exist but there is no
formal aggregation into a release evidence set (FR-1406–1408, FR-1413).

**Release evidence set structure** (`EvidenceSet` in `evidence.py`):
```python
@dataclass
class EvidenceSet:
    run_id: str
    created_at: str
    manifest_path: str
    components: list[str]           # which reports are present (FR-1407)
    overall_status: str             # "pass" | "pass-with-warnings" | "fail" (FR-1408)
    validation_result: ... | None
    build_result: ... | None
    import_result: ... | None
    drift_result: ... | None
    warnings: list[WarningRecord]   # aggregated across all components
    completeness_note: str | None   # which expected components are absent (FR-1413)
```

**`assemble_evidence_set(run_id: str) -> EvidenceSet`**:
- Reads all component records for the run from the evidence store
- Derives `overall_status`: `fail` if any component failed; `pass-with-warnings` if
  any warnings exist; `pass` otherwise
- Sets `completeness_note` if expected components are absent for the workflow type
  (e.g. a roundtrip workflow should have build + import + drift; if drift is absent,
  note it)
- Enumerates `components` list (FR-1407)

**Wire into interfaces:**
- REST `GET /evidence/{run_id}` → return `EvidenceSet` instead of raw record
- MCP `get_evidence(run_id)` → same
- CLI `markidocx evidence <run_id>` → display `EvidenceSet` summary
- Workflow commands: assemble and persist the evidence set at workflow completion

**Tests:**
- `test_evidence.py`: assert `assemble_evidence_set` returns correct `overall_status`
  for pass / pass-with-warnings / fail scenarios
- Assert `components` enumeration is accurate
- Assert `completeness_note` fires when a component is absent

Deliverable: All three interfaces return a coherent `EvidenceSet` with `overall_status`,
`components`, and `completeness_note`. Existing evidence tests still pass.
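
The status precedence (`fail` over `pass-with-warnings` over `pass`) and the completeness check described for `assemble_evidence_set` can be sketched as two small pure functions. Everything here is illustrative: the function names, the per-component status strings, and the `EXPECTED_COMPONENTS` table are assumptions, with only the roundtrip entry (build + import + drift) taken from the task text.

```python
from typing import Optional

# Assumed lookup from workflow type to expected components; only the
# roundtrip row comes from the workplan, other workflows are not modelled.
EXPECTED_COMPONENTS = {
    "roundtrip": ("build", "import", "drift"),
}


def derive_overall_status(component_statuses: dict[str, str], warning_count: int) -> str:
    """A failed component wins; otherwise any warning downgrades a pass."""
    if any(status == "fail" for status in component_statuses.values()):
        return "fail"
    if warning_count > 0:
        return "pass-with-warnings"
    return "pass"


def completeness_note(workflow: str, present: list[str]) -> Optional[str]:
    """Name the expected components that are absent, or return None if complete."""
    expected = EXPECTED_COMPONENTS.get(workflow, ())
    missing = [name for name in expected if name not in present]
    if not missing:
        return None
    return f"missing expected components for {workflow}: {', '.join(missing)}"
```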

---

## T05 — LEVEL3 edge-case coverage

```task
id: MRKD-WP-0007-T05
status: todo
priority: low
state_hub_task_id: 20789d1c-4495-468f-bbb7-912e63e804e4
```

Core LEVEL3 paths are tested; this task adds targeted tests for three
undertested edge-case areas.

**FR-534 — Diagram source mutation on round-trip**
- Test: build a DOCX with a mermaid block; manually alter the alt-text source marker
  in the DOCX (simulate editorial mutation of the embedded diagram); import → assert
  that `differ.py` classifies the change as `structural` (not silently dropped)
- Test: assert that a diagram block with empty source produces a `WarningRecord`

**FR-538 — Processor dependency version matrix**
- Test: mock `shutil.which("mmdc")` to return a path, and mock the `mmdc --version`
  subprocess call to return `"10.x.x"` (supported) or `"8.x.x"` (too old)
- Assert that an outdated renderer produces `WarningRecord(reason="renderer-version-unsupported")`
  rather than silently falling back
- If version-check logic is not yet present in the renderer backends in `diagrams.py`,
  add it as part of this task
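
The FR-538 check and its mocked test could take roughly the following shape. This is a sketch under assumptions rather than the code in `diagrams.py`: the function names are hypothetical, the minimum supported major version of 10 is inferred from the two examples above (the plan does not say where 9.x falls), and treating unparseable output as unsupported is a design choice made only for this illustration.

```python
import re
import shutil
import subprocess
from typing import Optional
from unittest import mock

# Assumption: the plan only states "10.x.x" is supported and "8.x.x" is too old.
MIN_SUPPORTED_MAJOR = 10


def read_mmdc_version() -> Optional[str]:
    """Return the raw `mmdc --version` output, or None if mmdc is not on PATH."""
    path = shutil.which("mmdc")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


def version_warning_reason(version_output: str) -> Optional[str]:
    """Return the warning reason for an outdated or unparseable version, else None."""
    match = re.search(r"(\d+)\.\d+\.\d+", version_output)
    if match is None or int(match.group(1)) < MIN_SUPPORTED_MAJOR:
        return "renderer-version-unsupported"
    return None


def check_with_mocked_renderer(fake_version: str) -> Optional[str]:
    """Exercise the check without a real mmdc binary, as the FR-538 test would."""
    with mock.patch("shutil.which", return_value="/usr/bin/mmdc"):
        with mock.patch(
            "subprocess.run", return_value=mock.Mock(stdout=fake_version + "\n")
        ):
            version = read_mmdc_version()
    return version_warning_reason(version or "")
```

In a real test the returned reason would be wrapped in a `WarningRecord`; the plain string is used here only to keep the sketch free of project types.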

**FR-542 — Bibliography ambiguity edge cases**
- Test: document with two citations sharing the same key → assert `WarningRecord`
- Test: document with a citation key that has no corresponding reference entry →
  assert `WarningRecord(reason="citation-key-missing")`
- Test: round-trip of a references section with special characters in author names
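
The first two FR-542 cases reduce to set and multiset checks over citation keys, sketched below. Several things are assumed: `citation_warnings` is a hypothetical helper, "sharing the same key" is read here as two reference entries defined under one key, and the reason string `citation-key-duplicate` is invented for the illustration (only `citation-key-missing` appears in the plan). How keys are parsed out of the Markdown is not covered.

```python
from collections import Counter


def citation_warnings(
    cited_keys: list[str], reference_keys: list[str]
) -> list[tuple[str, str]]:
    """Return (reason, key) pairs for ambiguous or dangling citation keys."""
    warnings: list[tuple[str, str]] = []

    # A key defined by more than one reference entry is ambiguous.
    counts = Counter(reference_keys)
    for key in sorted(key for key, count in counts.items() if count > 1):
        warnings.append(("citation-key-duplicate", key))  # hypothetical reason

    # A cited key with no reference entry at all is dangling.
    for key in sorted(set(cited_keys) - set(reference_keys)):
        warnings.append(("citation-key-missing", key))

    return warnings
```

Sorting the keys keeps the warning order deterministic, which makes the resulting records easier to assert on in tests.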

**Tests location:** extend `tests/test_level3_diagrams.py`, `tests/test_level3_bibliography.py`

Deliverable: All three edge-case areas have at least two targeted tests each.
Existing LEVEL3 tests still pass.

---

## T06 — End-to-end Word-first round-trip: template extraction and rebuild verification

```task
id: MRKD-WP-0007-T06
status: todo
priority: high
state_hub_task_id: 0c16c598-bd49-4721-89a3-e989e1d36879
```

This task delivers a new capability: given an existing Word document as the starting
point, marki-docx can decompose it into a Markdown content file and a content-free
DOCX template, and then verify that recombining the two recreates the original document.

This closes the loop on the round-trip: the existing flow is MD → DOCX → MD; this
adds DOCX → (MD + template) → DOCX, making Word-authored documents first-class inputs.

### New command: `markidocx template extract <source.docx>`

Extracts the structural and stylistic shell of `source.docx` — keeping all styles,
page setup, headers/footers, section properties, and theme data — while removing all
body content (paragraphs, tables, figures, etc.).

```
markidocx template extract <source.docx> \
  [--template-out <template.docx>]   # default: <source>-template.docx
  [--content-out <content.md>]       # default: <source>.md (runs import)
  [--family <name>]                  # register extracted template under this family name
  [--json]
```
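
The default output paths in the synopsis above can be derived from the source path alone. The helper below is a hypothetical sketch (`default_outputs` is not an existing function), and it assumes that `<source>` in the defaults means the source file name without its `.docx` extension, placed next to the source file.

```python
from pathlib import Path
from typing import Optional


def default_outputs(
    source: Path,
    template_out: Optional[Path] = None,
    content_out: Optional[Path] = None,
) -> tuple[Path, Path]:
    """Resolve (template path, content path), falling back to the documented defaults."""
    # <source>-template.docx, placed beside the source document
    template = template_out or source.with_name(f"{source.stem}-template.docx")
    # <source>.md, placed beside the source document
    content = content_out or source.with_suffix(".md")
    return template, content
```

For example, `default_outputs(Path("docs/report.docx"))` resolves to `docs/report-template.docx` and `docs/report.md`, while an explicit `--template-out` value overrides only the first path.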

**Outputs:**
1. `<template.docx>` — the content-free shell (styles preserved, body empty)
2. `<content.md>` — the Markdown content extracted via the existing `import` path

**Implementation in `templates.py`:**
```python
def extract_template(source_path: Path, template_out: Path) -> TemplateExtractionResult:
    """
    Open source_path with python-docx. Copy all styles, page setup,
    headers/footers, and theme. Clear the document body (remove all
    paragraphs and tables). Save to template_out.
    """
```

`TemplateExtractionResult`:
```python
@dataclass
class TemplateExtractionResult:
    template_path: Path
    styles_preserved: int   # count of styles copied
    warnings: list[WarningRecord]
```
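
The "clear the document body" step can be illustrated at the XML level with the standard library alone. In WordprocessingML the final section's properties sit in a `w:sectPr` element that is the last child of `w:body`, and headers and footers are referenced from it, while styles and theme live in separate package parts. So one possible approach, sketched here under those assumptions and not taken from `templates.py`, is to drop every body child except that `w:sectPr`. `clear_body` is a hypothetical name; section breaks stored inside paragraph properties earlier in the document would be removed together with their paragraphs, which a real implementation may want to report as a warning.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def clear_body(body: ET.Element) -> int:
    """Remove every child of w:body except w:sectPr; return how many were removed."""
    keep_tag = f"{{{W_NS}}}sectPr"
    removed = 0
    # Iterate over a copy because the element is mutated during the loop.
    for child in list(body):
        if child.tag != keep_tag:
            body.remove(child)
            removed += 1
    return removed
```

Running `clear_body` a second time on the same element removes nothing, which matches the expectation in the idempotence test that extracting from an already-empty template is a no-op.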

**Wire into CLI:**
- `template_app` already exists in `cli.py`; add `extract` subcommand
- After extraction, optionally run `import` on the source to produce the `.md` file
- Print a summary: styles preserved, content extracted, paths written

**Wire into REST and MCP:**
- REST: `POST /template/extract` — multipart upload of `source.docx`; returns
  `TemplateExtractionResult` + download URLs for template and MD
- MCP: `extract_template(source_path: str, template_out: str, content_out: str)`

### End-to-end regression test

Add `tests/regression/test_word_first_roundtrip.py`:

```
Fixture: tests/regression/fixtures/word_first/source.docx
  — A representative Word document with headings, body text, a table,
    an image, and a footer. Committed to the repo as a binary fixture.

Test: test_word_first_roundtrip
  1. extract_template(source.docx) → template.docx + content.md
  2. Assert template.docx has zero body paragraphs
  3. Assert template.docx preserves at least the styles present in source.docx
  4. Assert content.md is non-empty and contains the expected headings
  5. build(manifest pointing at content.md + template.docx) → rebuilt.docx
  6. import(rebuilt.docx) → reimported.md
  7. Assert reimported.md is structurally equivalent to content.md
     (use differ.py; assert zero structural drift)

Test: test_template_extraction_idempotent
  1. extract_template(source.docx) → template_a.docx
  2. extract_template(template_a.docx) → template_b.docx
  3. Assert template_b has same style set as template_a (extraction of an
     already-empty template is a no-op)
```

**Fixture creation:**
- Create `tests/regression/fixtures/word_first/` directory
- Programmatically generate `source.docx` using `python-docx` in a fixture-generator
  script (`tests/regression/fixtures/word_first/generate.py`) — this keeps the binary
  reproducible from source
- Commit the generated `source.docx` as a stable binary fixture (tracked in git)

### Success criteria for T06

1. `markidocx template extract source.docx` produces a valid content-free template
   and a Markdown content file
2. The extracted template + content can be built back into a DOCX via `markidocx build`
3. The rebuilt DOCX imports cleanly with zero structural drift against the extracted
   content
4. `test_word_first_roundtrip` passes in CI
5. REST and MCP surfaces expose the new capability

---

## Execution order

- T01, T02, T03 are independent — can be worked in any order or in parallel
- T04 depends on T02 (the evidence CLI command exposes the assembled set)
- T05 is independent — can be worked at any time
- T06 is independent of T01–T05 but benefits from T04 (evidence for the rebuild step)

## Updating task status

```
status: todo → status: in_progress   (when you start it)
status: in_progress → status: done   (when verified complete)
```

When every task is `done`, set the frontmatter `status: done`.

## Success criteria

Before marking the workplan done:

1. Every task block has `status: done`
2. Workplan frontmatter `status: done`
3. Full test suite passes (`pytest --tb=short -q`)
4. `ruff check` and `mypy src/` clean
5. `markidocx inspect`, `markidocx test`, `markidocx evidence`, `markidocx template extract`
   all present and functional
6. `GET /styles` returns real style data (not `[]`)
7. `markidocx evidence <run_id>` returns an `EvidenceSet` with `overall_status`
8. `test_word_first_roundtrip` passes
9. LEVEL3 edge-case tests added and passing