generated from coulomb/repo-seed
epub3 inbound filter
workplans/MKTF-WP-0001-epub3-read-adapter.md
@@ -0,0 +1,56 @@
---
id: MKTF-WP-0001
type: workplan
title: "EPUB3 Read Adapter"
domain: markitect
status: done
owner: markitect-filter
topic_slug: markitect
planning_priority: complete
planning_order: 10
depends_on_workplans:
  - MKTT-WP-0018
created: "2026-05-14"
updated: "2026-05-14"
---

# MKTF-WP-0001: EPUB3 Read Adapter

## Purpose

Implement the first concrete `markitect-filter` source adapter:
`source.epub3`, a read-only EPUB3 adapter that satisfies the
`markitect-tool` source adapter contract.

## Implemented Scope

- Python package scaffold with `pyproject.toml`.
- Entry point group registration:
  `markitect_tool.source_adapters`.
- Lightweight `epub3_adapter_descriptor`.
- Stdlib-only EPUB3 package reading with `zipfile` and `ElementTree`.
- `META-INF/container.xml` rootfile discovery.
- OPF metadata, manifest, and spine extraction.
- EPUB nav label extraction.
- XHTML body extraction into ordered Markdown segments.
- Source provenance with package paths, hrefs, anchors, and section labels.
- Structured diagnostics for malformed EPUBs, skipped boilerplate, missing
  spine items, unsupported media, and malformed XML.
- Tests for descriptor shape, matching, inspection, normalization, malformed
  packages, Markitect API registry use, and entry point shape.
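
To make the rootfile discovery and spine extraction items concrete, here is a minimal sketch of how they can be done with only `zipfile` and `ElementTree`. It is not the adapter's actual code: `find_rootfile`, `read_spine`, and `build_sample_epub` are hypothetical names chosen for illustration, and only the XML namespaces and the `META-INF/container.xml` location come from the EPUB specifications.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the OCF container and OPF package formats.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def find_rootfile(zf: zipfile.ZipFile) -> str:
    """Return the OPF path declared in META-INF/container.xml (hypothetical helper)."""
    root = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", NS)
    if rootfile is None or not rootfile.get("full-path"):
        raise ValueError("container.xml declares no rootfile")
    return rootfile.get("full-path")


def read_spine(zf: zipfile.ZipFile) -> list[str]:
    """Return archive paths of spine items in reading order (hypothetical helper)."""
    opf_path = find_rootfile(zf)
    package = ET.fromstring(zf.read(opf_path))
    # The manifest maps item ids to hrefs relative to the OPF file's directory.
    manifest = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", NS)
    }
    base = posixpath.dirname(opf_path)
    paths = []
    for itemref in package.findall("opf:spine/opf:itemref", NS):
        href = manifest.get(itemref.get("idref"))
        if href is None:
            # Assumption: a real adapter would record a "missing spine item"
            # diagnostic here instead of silently skipping.
            continue
        paths.append(posixpath.normpath(posixpath.join(base, href)))
    return paths


def build_sample_epub() -> bytes:
    """Build a tiny in-memory EPUB-like archive for demonstration purposes."""
    container = (
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        "</container>"
    )
    opf = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
        "<manifest>"
        '<item id="ch1" href="text/ch1.xhtml" media-type="application/xhtml+xml"/>'
        "</manifest>"
        '<spine><itemref idref="ch1"/><itemref idref="missing"/></spine>'
        "</package>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", container)
        zf.writestr("OEBPS/content.opf", opf)
    return buf.getvalue()
```

Calling `read_spine(zipfile.ZipFile(io.BytesIO(build_sample_epub())))` on the sample resolves the one valid spine entry against the OPF directory and skips the `idref` that has no manifest item.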

## Non-Goals

- PDF, DOCX, ODT, OCR, or browser extraction.
- Write/export adapters.
- Network fetching.
- Styling-preserving conversion.
- Image extraction beyond future metadata/attachment handling.

## Validation

Run from `markitect-filter`:

```bash
PYTHONPATH=src:/home/worsch/markitect-tool/src python3 -m pytest
```
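
The workplan lists malformed packages among the tested cases. As a rough illustration of what such a pytest case might look like, the sketch below checks two failure modes using only the standard library. `Diagnostic`, `check_package`, and the diagnostic codes are hypothetical stand-ins; the real adapter's diagnostic shape and API may differ.

```python
import io
import zipfile
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    """Hypothetical diagnostic record; not the adapter's actual type."""

    code: str
    message: str


def check_package(data: bytes) -> list[Diagnostic]:
    """Return diagnostics for obviously malformed EPUB bytes (illustrative only)."""
    try:
        zf = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile as exc:
        # The bytes are not a ZIP archive at all.
        return [Diagnostic("not_a_zip", str(exc))]
    with zf:
        if "META-INF/container.xml" not in zf.namelist():
            return [
                Diagnostic("missing_container", "META-INF/container.xml not found")
            ]
    return []


def test_non_zip_bytes_are_reported():
    assert [d.code for d in check_package(b"not an epub")] == ["not_a_zip"]


def test_missing_container_is_reported():
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
    codes = [d.code for d in check_package(buf.getvalue())]
    assert codes == ["missing_container"]
```

Returning diagnostics as data rather than raising lets a test assert on stable codes instead of exception messages.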