# SCOPE

> This file helps humans and agents understand when this repository is useful,
> what it owns, and where its boundaries stop.

---

## One-liner

markitect-filter provides concrete source-format adapters that convert EPUB3 and digitally readable PDF inputs into normalized Markitect Markdown documents.

---

## Core Idea

This repo keeps source extraction outside `markitect-tool` while implementing the Markitect source adapter contract. It turns selected external document formats into deterministic, normalized read-side artifacts that the Markitect core can consume without knowing each format's package or file structure.

---

## In Scope

- EPUB3 read adapter descriptor and package-to-Markitect normalization.
- PDF read adapter descriptor for local, digitally readable PDF inputs.
- Source attachment metadata for EPUB package resources, PDF embedded files, and PDF image-resource signals.
- Tests, examples, and docs for adapter contract compatibility.
- Entry points under `markitect_tool.source_adapters`.

---

## Out of Scope

- Markitect core document, contract, render, memory, or workflow engines.
- OCR, scanned-document recognition, and layout-perfect PDF reconstruction.
- Export or rendering behavior.
- Remote ingestion services, queues, storage, or production hosting.
- Owning the passive render asset manifest contract beyond read-side handoff metadata.

---

## Relevant When

- You need Markitect to ingest EPUB3 or digitally readable PDF sources.
- You are testing source adapter descriptors against `markitect-tool`.
- You need attachment metadata from source documents for downstream render manifest planning.
- You are maintaining EPUB3/PDF normalization fixtures and examples.

---

## Not Relevant When

- The needed behavior belongs in the source adapter contract itself.
- The work is Quarkdown rendering or export production.
- The input is scanned or image-only PDF requiring OCR.
- The task is general Markdown transformation after normalization.

---

## Current State

- Status: active.
- Implementation: Python 3.12 package with EPUB3, PDF, attachment metadata, tests, examples, and docs.
- Stability: adapter slices are deterministic and test-covered.
- Integration: registered in the Custodian State Hub as `markitect-filter`.

---

## How It Fits

- Upstream contract: `markitect-tool` owns the source adapter interfaces and normalized document model.
- Downstream consumers: Markitect workflows load this package through source adapter entry points.
- Adjacent repo: `markitect-quarkdown` consumes Markitect render/export contracts on the output side.

---

## Terminology

- Preferred terms: source adapter, normalized Markdown document, attachment metadata, EPUB3 package, digitally readable PDF.
- Also known as: Markitect filter adapters.
- Potentially confusing terms: "filter" means source normalization here, not policy filtering or search result filtering.

---

## Related / Overlapping

- `markitect-tool` - owns core contracts and normalized document APIs.
- `markitect-quarkdown` - owns concrete rendering/export through Quarkdown.
- `the-custodian` - State Hub registration, workplan tracking, and consistency sync.

---

## Getting Oriented

- Start with: `README.md`, `docs/pdf-adapter.md`, and `docs/source-attachment-metadata.md`.
- Key directories: `src/markitect_filter/`, `tests/`, `examples/`, and `workplans/`.
- Entry points: `markitect_filter.adapters:epub3_adapter_descriptor` and `markitect_filter.adapters:pdf_adapter_descriptor`.

---

## Provided Capabilities

```capability
type: source_adapter
title: source.epub3
description: Convert EPUB3 packages into normalized Markitect Markdown documents with package resource attachment metadata.
keywords: [epub3, source-adapter, markdown, attachments]
```

```capability
type: source_adapter
title: source.pdf
description: Convert local digitally readable PDFs into normalized Markitect Markdown documents with embedded-file and image-resource signals.
keywords: [pdf, source-adapter, markdown, attachments]
```

---

## Notes

Run tests with `PYTHONPATH=src:/home/worsch/markitect-tool/src python3 -m pytest` from this checkout.