| id | type | status | created | deciders |
|---|---|---|---|---|
| ADR-003 | adr | accepted | 2026-03-16 | |
ADR-003: Manifest YAML Schema
Status
Accepted
Context
markidocx needs a project definition format that:
- Describes which Markdown source files form a document project
- Declares the feature level (level1/level3) and document family (article, book, website)
- Specifies output location and document metadata
- Is human-writable and version-controllable alongside source files
- Is parseable by the system without a schema registry or external validator
The format must support single-file and multi-file projects, and be extensible enough for future additions (e.g. bibliography sources, asset directories) without breaking existing manifests.
Decision
Use YAML with a fixed four-section top-level structure:
    project:
      name: <string>
      feature_level: level1 | level3
      family: article | book | website
    sources:
      - path: <relative path to .md file>
      - path: <relative path to .md file>
    output:
      dir: <relative path to output directory>
    metadata:
      title: <string>
      author: <string>
      date: <string>
All paths are resolved relative to the manifest file's location. The metadata
section and the individual sources entries may gain additional keys in future versions.
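
The path rule above can be sketched as follows. This is a minimal illustration using pathlib; the helper name resolve_against_manifest and the example paths are hypothetical and are not taken from manifest.py. The point is that a relative entry is anchored to the manifest's directory, not to the current working directory.

```python
from pathlib import Path


def resolve_against_manifest(manifest_path: Path, relative: str) -> Path:
    """Resolve `relative` against the directory that contains the manifest.

    Hypothetical helper for illustration; the real manifest.py may differ.
    """
    return (manifest_path.parent / relative).resolve()


# Example paths are made up; they do not need to exist for resolve() to work.
manifest = Path("/repo/docs/manifest.yaml")
print(resolve_against_manifest(manifest, "chapters/intro.md"))
print(resolve_against_manifest(manifest, "../dist"))
```

Because the anchor is the manifest's parent directory, the same manifest yields the same relative layout wherever the repository is checked out.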
Validation is performed on load by manifest.py using dataclass coercion:
load_manifest(path) raises ManifestError on any schema violation (missing
required fields, unknown feature levels, unresolvable source paths).
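
A rough sketch of what validation on load could look like, under stated assumptions: the names load_manifest and ManifestError come from this ADR, while the Manifest dataclass, the parse_manifest and _require_str helpers, the rule that sources must be non-empty, and the use of PyYAML's safe_load are assumptions made for illustration. The actual manifest.py may be structured differently.

```python
from dataclasses import dataclass, field
from pathlib import Path

FEATURE_LEVELS = {"level1", "level3"}
FAMILIES = {"article", "book", "website"}


class ManifestError(ValueError):
    """Raised on any schema violation."""


@dataclass(frozen=True)
class Manifest:  # hypothetical shape, mirroring the four top-level sections
    name: str
    feature_level: str
    family: str
    sources: list
    output_dir: Path
    metadata: dict = field(default_factory=dict)


def _require_str(section: dict, key: str, where: str) -> str:
    """Return section[key] if it is a non-empty string, else raise."""
    value = section.get(key)
    if not isinstance(value, str) or not value:
        raise ManifestError(f"{where}.{key} must be a non-empty string")
    return value


def parse_manifest(data, base_dir: Path) -> Manifest:
    """Validate an already-parsed YAML mapping against the schema."""
    if not isinstance(data, dict) or not isinstance(data.get("project"), dict):
        raise ManifestError("missing required section: project")
    project = data["project"]
    name = _require_str(project, "name", "project")
    level = _require_str(project, "feature_level", "project")
    if level not in FEATURE_LEVELS:
        raise ManifestError(f"unknown feature level: {level!r}")
    family = _require_str(project, "family", "project")
    if family not in FAMILIES:
        raise ManifestError(f"unknown family: {family!r}")

    entries = data.get("sources")
    if not isinstance(entries, list) or not entries:  # assumed rule
        raise ManifestError("sources must be a non-empty list")
    sources = []
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ManifestError(f"sources[{i}] must be a mapping")
        rel = _require_str(entry, "path", f"sources[{i}]")
        path = (base_dir / rel).resolve()
        if not path.is_file():
            raise ManifestError(f"unresolvable source path: {path}")
        sources.append(path)

    output = data.get("output") or {}
    if not isinstance(output, dict):
        raise ManifestError("output must be a mapping")
    out_rel = output.get("dir", "./dist")  # default from the field table
    if not isinstance(out_rel, str):
        raise ManifestError("output.dir must be a string")

    metadata = data.get("metadata") or {}
    if not isinstance(metadata, dict):
        raise ManifestError("metadata must be a mapping")
    return Manifest(name, level, family, sources,
                    (base_dir / out_rel).resolve(), dict(metadata))


def load_manifest(path) -> Manifest:
    """Read the YAML file, then validate it. Assumes PyYAML is installed."""
    import yaml  # assumption: PyYAML; imported lazily so parsing stays optional

    path = Path(path)
    with path.open(encoding="utf-8") as fh:
        data = yaml.safe_load(fh)
    return parse_manifest(data, path.parent)
```

Splitting parsing from validation keeps the checks testable on plain dictionaries; a caller would normally only use load_manifest(path) and handle ManifestError.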
Current Field Definitions
| Field | Type | Required | Default | Notes |
|---|---|---|---|---|
| project.name | string | yes | - | Project identifier; used in output filenames |
| project.feature_level | enum | yes | - | level1 or level3 |
| project.family | enum | yes | - | article, book, or website |
| sources[].path | string | yes | - | Relative path; resolved against manifest dir |
| output.dir | string | no | ./dist | Relative path for generated artefacts |
| metadata.title | string | no | - | Propagated to DOCX document properties |
| metadata.author | string | no | - | Propagated to DOCX document properties |
| metadata.date | string | no | - | Propagated to DOCX document properties |
Consequences
Positive:
- Human-readable and diff-friendly; natural fit for version-controlled documentation repositories
- No external schema validation library needed; manifest.py owns validation
- Simple enough for a first-time user to write by hand
- Relative paths keep manifests portable across machines
Negative / accepted limitations:
- Evolving the schema requires coordination between the manifest file format and manifest.py; there is no formal schema version field
- No auto-completion support in editors without a JSON Schema / YAML Language Server configuration (out of scope for v0.1)
- YAML's implicit type coercion can surprise users (e.g. bare no parsed as False); load_manifest validates all fields explicitly to catch these cases
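
To illustrate the coercion point in the last item, here is a small sketch of an explicit string check. It assumes a YAML 1.1 style loader such as PyYAML, where an unquoted no loads as False and an unquoted ISO date such as 2026-03-16 loads as a datetime.date. The helper require_str is hypothetical and the ManifestError class here is a stand-in; neither is copied from manifest.py.

```python
import datetime


class ManifestError(ValueError):
    """Stand-in for the ManifestError named in this ADR."""


def require_str(value: object, field_name: str) -> str:
    """Accept only real strings; reject values the loader coerced to another type."""
    if isinstance(value, str):
        return value
    raise ManifestError(
        f"{field_name} must be a string, got {type(value).__name__} ({value!r}); "
        "quote the value in the manifest"
    )


# Values as a YAML 1.1 loader would hand them over for unquoted scalars:
#   name: no          -> False
#   date: 2026-03-16  -> datetime.date(2026, 3, 16)
parsed = {
    "project.name": False,
    "metadata.date": datetime.date(2026, 3, 16),
    "metadata.title": "User Guide",
}

for field_name, value in parsed.items():
    try:
        require_str(value, field_name)
        print(f"{field_name}: ok")
    except ManifestError as exc:
        print(f"{field_name}: rejected ({exc})")
```

Quoting the scalar in the manifest (for example date: "2026-03-16") keeps it a string under such a loader, so a check like this would pass.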
Alternatives Rejected
TOML — good alternative, but YAML is more common in documentation tooling (MkDocs, GitHub Actions, Kubernetes) and more familiar to the target audience.
JSON — less writable for humans; comments not supported; trailing commas disallowed; less pleasant for multi-line string values.
Database / registry — over-engineered for the single-project use case; would require a running service just to define a document project.
Pydantic / JSON Schema — considered for validation, but adds a dependency
for functionality that a handful of explicit checks in load_manifest() already
covers cleanly.