---
id: MKTF-WP-0003
type: workplan
title: "Source Attachment Manifest Compatibility"
domain: markitect
status: done
owner: markitect-filter
topic_slug: markitect
planning_priority: complete
planning_order: 30
depends_on_workplans:
  - MKTF-WP-0001
  - MKTF-WP-0002
related_workplans:
  - MKTT-WP-0018
  - MKTT-WP-0021
  - MKTT-WP-0020
created: "2026-05-15"
updated: "2026-05-15"
state_hub_workstream_id: "16e5c830-31e3-4070-9e27-65d28ed06595"
---

# MKTF-WP-0003: Source Attachment Manifest Compatibility

## Purpose

Provide the read-side source attachment and asset metadata needed by the Markitect render reference and asset manifest contract without making `markitect-filter` a renderer or export pipeline.

`markitect-filter` owns concrete source-format normalization. It should expose attachments, embedded media, package resources, and related provenance as normalized source metadata that `markitect-tool` can consume when building a render asset manifest.

## Boundary

`markitect-filter` owns:

- source-format-specific attachment discovery
- read-side source asset metadata
- media type, extension, size, and digest capture where available
- provenance back to package paths, pages, anchors, or source members
- diagnostics for skipped or unsupported embedded resources
- fixtures proving EPUB3 and PDF adapters preserve read-side asset metadata

`markitect-filter` does not own:

- write/export adapters
- renderer execution
- output asset copying
- final render artifact paths
- publication lifecycle or durable artifact storage
- Quarkdown invocation

Those responsibilities belong to `markitect-tool` contracts, `markitect-quarkdown` render integration, or later runtime/publication systems.

## Implementation Summary

Completed as a read-side attachment metadata compatibility slice:

- Added shared source attachment metadata helpers and exported `markitect-filter.source-attachment.v1`.
- EPUB3 read results now populate `NormalizedMarkdownDocument.attachments` for package images, stylesheets, fonts, audio, and video with byte size, digest, package path, manifest id, href, and render-manifest compatibility metadata.
- PDF read results now populate attachments for embedded file streams and signal-only image resources where the stdlib parser can detect them.
- Unsupported EPUB resources, missing EPUB resources, PDF image signals, and unreadable embedded files produce structured diagnostics.
- Docs, handoff fixtures, adapter descriptors, README notes, and tests were added without introducing renderer/export behavior.

## P3.1 - Align attachment metadata with Markitect source contracts

```task
id: MKTF-WP-0003-T001
status: done
priority: high
state_hub_task_id: "d119daca-8141-4662-8ad7-ce43ccd79044"
```

Confirm how existing `markitect_tool.source.SourceAsset` and `NormalizedMarkdownDocument.attachments` should be populated by concrete read adapters.

Output: compatibility note, adapter metadata conventions, and tests that can be run with `markitect-tool` on `PYTHONPATH`.

## P3.2 - Add EPUB3 embedded resource metadata

```task
id: MKTF-WP-0003-T002
status: done
priority: medium
state_hub_task_id: "ebcbf480-210d-46e7-a4e4-fbe7e9baa39a"
```

Extend the EPUB3 adapter to report package resources that are relevant to future render manifests, such as images, stylesheets, fonts, and media where safe and cheap to inspect.

Output: EPUB3 attachment metadata, provenance, diagnostics, and fixtures.

## P3.3 - Add PDF attachment and image-signal metadata

```task
id: MKTF-WP-0003-T003
status: done
priority: medium
state_hub_task_id: "d8b7b820-387f-4d45-bf22-296b227f917a"
```

Extend the PDF adapter with read-side metadata for attachments or embedded resource signals where the current dependency profile can expose them reliably. Do not add OCR, layout reconstruction, or renderer behavior.

Output: PDF metadata conventions, diagnostics, and tests.

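As a rough illustration of the signal-only detection described in P3.3, the sketch below scans raw PDF bytes with the standard library and reports image resources as signals rather than extracted content. It is not the adapter's actual implementation: the function name `scan_pdf_signals`, the record fields, and the diagnostic code `pdf.image_signal` are all hypothetical, and a plain byte scan will miss resources stored inside compressed object streams.

```python
import re

# Hypothetical byte patterns for this sketch. Resources hidden inside
# compressed object streams will not be seen by a plain byte scan.
IMAGE_SIGNAL = re.compile(rb"/Subtype\s*/Image\b")
EMBEDDED_FILE_SIGNAL = re.compile(rb"/Type\s*/EmbeddedFile\b")


def scan_pdf_signals(data: bytes) -> dict:
    """Return signal-only image records and diagnostics for raw PDF bytes."""
    attachments = []
    diagnostics = []
    for match in IMAGE_SIGNAL.finditer(data):
        # Record that an image exists without decoding or extracting it.
        attachments.append(
            {
                "kind": "image-signal",  # hypothetical field names
                "byte_offset": match.start(),
                "signal_only": True,
            }
        )
        diagnostics.append(
            {
                "code": "pdf.image_signal",  # hypothetical diagnostic code
                "severity": "info",
                "message": f"image resource at byte offset {match.start()} "
                "was detected but not extracted",
            }
        )
    return {
        "attachments": attachments,
        "diagnostics": diagnostics,
        # Count of embedded file stream markers found in the raw bytes.
        "embedded_file_count": len(EMBEDDED_FILE_SIGNAL.findall(data)),
    }
```

Keeping the image record as a signal with a matching diagnostic stays within the boundary above: nothing is decoded, rendered, or copied, so the adapter remains read-only.
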
## P3.4 - Preserve checksums and provenance

```task
id: MKTF-WP-0003-T004
status: done
priority: high
state_hub_task_id: "ca539c01-c272-4635-8f60-86f870bbef0c"
```

For each attachment or source asset, preserve stable identity fields:

- source URI or package member path
- media type and extension
- byte size where available
- digest/checksum where feasible
- page, anchor, section, or package path provenance
- extraction diagnostics and quality notes

Output: deterministic digest/provenance tests.

## P3.5 - Provide handoff fixtures for render manifests

```task
id: MKTF-WP-0003-T005
status: done
priority: medium
state_hub_task_id: "f2213a20-ce6f-4e16-9b9b-557b99f8b4d1"
```

Add fixtures that `MKTT-WP-0021` can use to prove source attachments flow into render asset manifests without renderer execution.

Output: fixture files, README/docs update, and cross-repo validation command.

## Exit Criteria

- EPUB3 and PDF read adapters can expose read-side asset metadata when present.
- Unsupported or skipped resources produce structured diagnostics.
- `markitect-filter` remains read-only and does not implement export/render behavior.
- `markitect-tool` can consume the metadata for passive render asset manifests.
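
The first two exit criteria, together with the identity fields listed in P3.4, could be exercised with a sketch like the one below. It assumes a ZIP-based package (as EPUB3 uses) and walks archive members directly. A real EPUB3 adapter would more likely follow the package manifest. The function `describe_package_members`, the `MEDIA_TYPES` table, the record fields, and the `attachment.skipped` diagnostic code are hypothetical and do not reflect the actual `markitect-filter` or `markitect-tool` API.

```python
import hashlib
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical extension-to-media-type table; a real adapter would more
# likely trust the media types declared in the package manifest.
MEDIA_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".css": "text/css",
    ".woff2": "font/woff2",
    ".mp3": "audio/mpeg",
    ".mp4": "video/mp4",
}


def describe_package_members(package: zipfile.ZipFile) -> tuple[list[dict], list[dict]]:
    """Return (attachments, diagnostics) for members of a ZIP-based package."""
    attachments: list[dict] = []
    diagnostics: list[dict] = []
    # Sorting member names keeps output order independent of archive order.
    for name in sorted(package.namelist()):
        if name.endswith("/"):
            continue  # directory entry, not a resource
        extension = PurePosixPath(name).suffix.lower()
        media_type = MEDIA_TYPES.get(extension)
        if media_type is None:
            # Skipped resources are reported, not silently dropped.
            diagnostics.append(
                {"code": "attachment.skipped", "package_path": name}
            )
            continue
        data = package.read(name)
        attachments.append(
            {
                "package_path": name,  # provenance back to the source member
                "media_type": media_type,
                "extension": extension,
                "byte_size": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
        )
    return attachments, diagnostics


# Usage: build a tiny in-memory package and describe it.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("OEBPS/styles/book.css", b"body {}")
    archive.writestr("OEBPS/images/cover.png", b"not a real image")
    archive.writestr("OEBPS/extra/blob.xyz", b"??")
buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    attachments, diagnostics = describe_package_members(archive)
```

Because the digest is computed over member bytes and members are visited in sorted order, repeated reads of the same package yield identical records, which is the property a deterministic digest/provenance test would assert.
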