---
id: MRKD-WP-0004
type: workplan
domain: markitect
repo: marki-docx
status: active
state_hub_workstream_id: 91d06c92-caa8-42fc-b6d4-82340f1bed4f
created: 2026-03-16
updated: 2026-03-16
---

# MRKD-WP-0004 — Stable Documentation Corpus & Architecture Records

Fulfil FR-1101 by establishing the markidocx product documentation itself as a real,
managed markidocx project. The specs (PRD, FRS, UCC) become a live round-trip corpus
that the `release-regression` workflow runs against on every release. This workstream
also writes the two deferred architecture decision records and generates the first SBOM.

**Scope:** FR-1101–1110 (stable corpus & self-test), ADR-002, ADR-003, SBOM
**Out of scope:** diagram rendering, packaging, CI/CD — addressed in WP-0005/0006
**Depends on:** MRKD-WP-0001, MRKD-WP-0002, MRKD-WP-0003 — all complete

---

## T01 — Set up specs as real markidocx project manifest

```task
id: MRKD-WP-0004-T01
status: done
priority: high
state_hub_task_id: f1a36613-ceaa-4786-ac39-cd3a7fd1c142
```

Create a manifest file that treats the markidocx product documentation as a live
markidocx project. This makes the specs the stable corpus for regression testing
as required by FR-1101.

- Create `corpus/markidocx-docs/manifest.yaml`:
  - `project.name: markidocx-docs`
  - `project.feature_level: level1`
  - `project.family: article`
  - `sources`: PRD, FRS v0.2, UCC (relative paths into `specs/`)
  - `output.dir: corpus/markidocx-docs/dist`
- Run `markidocx validate corpus/markidocx-docs/manifest.yaml` — must exit 0
- Run `markidocx build corpus/markidocx-docs/manifest.yaml` — must produce valid DOCX
- Run `markidocx import` + `markidocx compare` — must report clean or expected drift only
- Document any structural drift in `corpus/markidocx-docs/known-drift.md`

Deliverable: `markidocx build corpus/markidocx-docs/manifest.yaml` succeeds; a DOCX
of the product documentation exists in `corpus/markidocx-docs/dist/`.
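
The validate and build steps in this task could be scripted roughly as follows. This is a minimal sketch and not part of the workplan: the helper names `corpus_commands` and `run_steps` are hypothetical, and only the two commands whose full arguments appear in the task are included. The `import` and `compare` invocations are left out because their arguments are not given here.

```python
import subprocess
from typing import Callable, List, Sequence

MANIFEST = "corpus/markidocx-docs/manifest.yaml"


def corpus_commands(manifest: str = MANIFEST) -> List[List[str]]:
    """Return the validate and build command lines from T01, in order."""
    return [
        ["markidocx", "validate", manifest],
        ["markidocx", "build", manifest],
    ]


def run_steps(
    commands: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = lambda cmd: subprocess.run(cmd).returncode,
) -> bool:
    """Run each command in order; stop at the first non-zero exit code."""
    for cmd in commands:
        if runner(cmd) != 0:
            return False
    return True
```

Passing the runner as a parameter is a design choice for the sketch: it lets the step ordering be checked without the `markidocx` CLI installed.
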

---

## T02 — Wire release-regression workflow against specs corpus

```task
id: MRKD-WP-0004-T02
status: done
priority: high
state_hub_task_id: f17e959f-28da-4386-9004-b5e036054b06
```

Connect the `release-regression` composite workflow to the real documentation corpus
so that `markidocx workflow release-regression corpus/markidocx-docs/manifest.yaml`
runs a full build → import → compare cycle and records evidence (FR-1102, FR-1103,
FR-1106, FR-1107).

- Update `workflows.py` `release-regression` handler to accept a manifest path argument;
  default to the corpus manifest when none supplied
- Run the workflow; assert the evidence set contains build, import, and drift reports
- Add `tests/regression/test_corpus_regression.py`:
  - Invokes `release-regression` on the corpus manifest
  - Asserts workflow result is `full` or `with-fallback` (not `failed`)
  - Asserts evidence artefacts are present and have correct traceability fields (FR-1110)
- Disclose corpus identity in regression output (FR-1109): include corpus manifest path
  and its git HEAD SHA as `corpus_id` in the workflow result

Deliverable: `pytest tests/regression/test_corpus_regression.py` passes; evidence
written to `.markidocx/evidence/` and retrievable via CLI.
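
The checks this task describes might look something like the sketch below. The shape of the workflow result is not specified in this workplan, so the dictionary keys (`status`, `evidence`, `corpus_id`, `manifest`, `git_sha`) and the helper names are assumptions made for illustration. Only the accepted status values, the three report kinds, and the default manifest path come from the task text.

```python
import subprocess
from typing import Any, Dict, Optional

DEFAULT_MANIFEST = "corpus/markidocx-docs/manifest.yaml"
ACCEPTED_STATUSES = {"full", "with-fallback"}
REQUIRED_EVIDENCE = {"build", "import", "drift"}


def git_head_sha(repo_dir: str = ".") -> str:
    """Return the current HEAD commit SHA of the repository at repo_dir."""
    out = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()


def make_corpus_id(manifest_path: Optional[str], head_sha: str) -> Dict[str, str]:
    """Pair the manifest path with a HEAD SHA; default to the corpus manifest."""
    # Hypothetical layout: the workplan only says both values are included.
    return {"manifest": manifest_path or DEFAULT_MANIFEST, "git_sha": head_sha}


def check_result(result: Dict[str, Any]) -> None:
    """Assert the properties the regression test is meant to verify."""
    assert result["status"] in ACCEPTED_STATUSES, result["status"]
    missing = REQUIRED_EVIDENCE - set(result["evidence"])
    assert not missing, f"missing evidence reports: {sorted(missing)}"
    assert result["corpus_id"]["manifest"] and result["corpus_id"]["git_sha"]
```

In a real test, `make_corpus_id(None, git_head_sha())` would supply the identity, and `check_result` would be applied to whatever the `release-regression` handler actually returns.
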

---

## T03 — ADR-002: python-docx as conversion engine

```task
id: MRKD-WP-0004-T03
status: done
priority: medium
state_hub_task_id: bfe2a9fa-25b2-4b4b-b21b-eae457716ce0
```

Write the architecture decision record explaining the choice of python-docx as the
DOCX conversion engine. This was identified as a deferred deliverable during WP-0001.

File: `architecture/ADR-002-python-docx-as-conversion-engine.md`

Cover:

- **Context:** need to produce and consume .docx files from Python; alternatives evaluated
  (pandoc subprocess, docx2python, mammoth, python-docx)
- **Decision:** python-docx for both build (write) and import (read)
- **Consequences:** direct paragraph/run model maps cleanly to Markdown structure;
  no subprocess dependency; limited to Open XML subset exposed by python-docx API;
  complex Word features (track changes, SmartArt) are out of scope by design
- **Alternatives rejected:** pandoc — heavier dependency, harder to control structure;
  mammoth — read-only; docx2python — limited write support

Deliverable: `architecture/ADR-002-*.md` present and follows ADR-001 conventions.
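
To make the "paragraph/run model maps cleanly to Markdown" consequence concrete, here is a small sketch of the read direction. It is not markidocx's actual importer. The functions rely only on attributes that python-docx paragraphs and runs expose (`Paragraph.runs`, `Paragraph.style.name`, `Run.text`, `Run.bold`, `Run.italic`), and the demo uses stand-in objects so it runs without the library installed.

```python
import re
from types import SimpleNamespace


def run_to_markdown(run) -> str:
    """Render one run, wrapping bold/italic text in Markdown emphasis."""
    text = run.text
    if not text:
        return ""
    if run.bold:  # python-docx uses True/False/None; None means "inherit"
        text = f"**{text}**"
    if run.italic:
        text = f"*{text}*"
    return text


def paragraph_to_markdown(paragraph) -> str:
    """Map a paragraph to a Markdown line using its style name and runs."""
    body = "".join(run_to_markdown(r) for r in paragraph.runs)
    match = re.fullmatch(r"Heading (\d)", paragraph.style.name)
    if match:
        return "#" * int(match.group(1)) + " " + body
    return body


# Stand-in objects that mimic the python-docx attributes used above.
heading = SimpleNamespace(
    style=SimpleNamespace(name="Heading 2"),
    runs=[SimpleNamespace(text="Scope", bold=None, italic=None)],
)
print(paragraph_to_markdown(heading))  # ## Scope
```

With python-docx available, the same function could be applied to each item of `docx.Document(path).paragraphs`.
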

---

## T04 — ADR-003: manifest YAML schema

```task
id: MRKD-WP-0004-T04
status: done
priority: medium
state_hub_task_id: b6de6733-b332-4efc-9e23-82fce205b856
```

Write the architecture decision record documenting the manifest YAML schema design.

File: `architecture/ADR-003-manifest-yaml-schema.md`

Cover:

- **Context:** need a project definition format that is human-writable, version-controlled,
  and parseable without a schema registry
- **Decision:** YAML with a fixed top-level structure (`project`, `sources`, `output`,
  `metadata`); validated on load via dataclass coercion
- **Schema snapshot:** include the current field definitions as a reference
- **Consequences:** simple for users; no JSON Schema or Pydantic dependency; evolving
  the schema requires coordination with manifest.py
- **Alternatives rejected:** TOML (less familiar in doc tooling), JSON (less writable),
  a database manifest (over-engineered for single-project use)

Deliverable: `architecture/ADR-003-*.md` present.
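
"Validated on load via dataclass coercion" can be illustrated with a short sketch. This is not the code in `manifest.py`: the class names, the `from_dict` helper, and the treatment of `metadata` as optional are assumptions. The field names are limited to those listed for the corpus manifest in T01, and the input is taken to be the mapping a YAML parser would already have produced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Project:
    name: str
    feature_level: str
    family: str


@dataclass
class Output:
    dir: str


@dataclass
class Manifest:
    project: Project
    sources: List[str]
    output: Output
    metadata: Dict[str, Any] = field(default_factory=dict)  # assumed optional

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "Manifest":
        """Coerce a parsed YAML mapping; unknown or missing keys raise."""
        unknown = set(raw) - {"project", "sources", "output", "metadata"}
        if unknown:
            raise ValueError(f"unknown top-level keys: {sorted(unknown)}")
        try:
            return cls(
                project=Project(**raw["project"]),
                sources=[str(s) for s in raw["sources"]],
                output=Output(**raw["output"]),
                metadata=dict(raw.get("metadata", {})),
            )
        except (KeyError, TypeError) as exc:
            # KeyError: missing section; TypeError: missing or extra field.
            raise ValueError(f"invalid manifest: {exc}") from exc
```

The dataclass constructors do the validation here: a missing or misspelled field fails at construction time, which is what keeps a separate schema library out of the dependency list.
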

---

## T05 — SBOM generation and state-hub registration

```task
id: MRKD-WP-0004-T05
status: blocked
blocking_reason: ops-bridge ingest_sbom_tool cannot access /home/tegwick/ paths (runs as worsch). Configure host_paths mapping for marki-docx, then re-run ingest.
priority: medium
state_hub_task_id: 36aecd50-8176-4122-9706-a8697d8f5936
```

Generate and register the first SBOM for marki-docx so the state hub has an accurate
dependency picture.

```bash
cd ~/the-custodian/state-hub
make ingest-sbom REPO=marki-docx SCAN=1 REPO_PATH=/home/tegwick/marki-docx
```

- Verify the SBOM ingestion completes without errors
- Confirm `last_sbom_at` is set for `marki-docx` in the state hub
- Document any licence issues or unexpected transitive dependencies
- Add a note to CLAUDE.md reminding to re-run SBOM after dependency changes

Deliverable: State hub shows `last_sbom_at` set for `marki-docx`; no unresolved
licence issues.

---

## How to Work

- Work through tasks in priority order: T01 → T02 (high), then T03 → T04 → T05 (medium)
- T01 must complete before T02 (T02 depends on the corpus manifest)
- T03 and T04 are independent writing tasks — can be done in any order or in parallel

## Updating Task Status

```
status: todo → status: in_progress (when you start it)
status: in_progress → status: done (when verified complete)
```

When every task is `done`, set the frontmatter `status: done`.
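
Whether every task is `done` can be checked mechanically by reading the fenced `task` blocks. The sketch below is one way to do that under simple assumptions: each block contains `key: value` lines with an `id` and a `status`, and the helper names `task_statuses` and `all_done` are hypothetical rather than part of any existing tooling.

```python
import re
from typing import Dict

FENCE = "`" * 3  # built programmatically to keep this example fence-safe
TASK_BLOCK = re.compile(
    rf"^{FENCE}task\n(.*?)^{FENCE}", re.DOTALL | re.MULTILINE
)


def task_statuses(markdown: str) -> Dict[str, str]:
    """Map each task id to its status, read from fenced task blocks."""
    statuses = {}
    for block in TASK_BLOCK.findall(markdown):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                # Split on the first colon only; values may contain colons.
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        statuses[fields["id"]] = fields["status"]
    return statuses


def all_done(markdown: str) -> bool:
    """True when the document has task blocks and every one is done."""
    statuses = task_statuses(markdown)
    return bool(statuses) and all(s == "done" for s in statuses.values())
```

Run against this workplan as it stands, `all_done` would return `False`, because T05 is still `blocked`.
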

## Success Criteria

Before marking the workplan done:

1. Every task block has `status: done`
2. Workplan frontmatter `status: done`
3. `corpus/markidocx-docs/manifest.yaml` present and builds cleanly
4. `pytest tests/regression/test_corpus_regression.py` passes
5. `architecture/ADR-002-*.md` and `architecture/ADR-003-*.md` present
6. State hub shows `last_sbom_at` set for `marki-docx`