generated from coulomb/repo-seed
Add repository scope profile
SCOPE.md (new file, 137 lines)

# SCOPE

> This file helps humans and agents understand when this repository is useful,
> what it owns, and where its boundaries stop.

---

## One-liner

markitect-filter provides concrete source-format adapters that convert EPUB3 and
digitally readable PDF inputs into normalized Markitect Markdown documents.

---

## Core Idea

This repo keeps source extraction outside `markitect-tool` while implementing
the Markitect source adapter contract. It turns selected external document
formats into deterministic, normalized read-side artifacts that the Markitect
core can consume without knowing each format's package or file structure.

---

## In Scope

- EPUB3 read adapter descriptor and package-to-Markitect normalization.
- PDF read adapter descriptor for local, digitally readable PDF inputs.
- Source attachment metadata for EPUB package resources, PDF embedded files,
  and PDF image-resource signals.
- Tests, examples, and docs for adapter contract compatibility.
- Entry points under `markitect_tool.source_adapters`.

---

## Out of Scope

- Markitect core document, contract, render, memory, or workflow engines.
- OCR, scanned-document recognition, and layout-perfect PDF reconstruction.
- Export or rendering behavior.
- Remote ingestion services, queues, storage, or production hosting.
- Owning the passive render asset manifest contract beyond read-side handoff
  metadata.

---

## Relevant When

- You need Markitect to ingest EPUB3 or digitally readable PDF sources.
- You are testing source adapter descriptors against `markitect-tool`.
- You need attachment metadata from source documents for downstream render
  manifest planning.
- You are maintaining EPUB3/PDF normalization fixtures and examples.

---

## Not Relevant When

- The needed behavior belongs in the source adapter contract itself.
- The work is Quarkdown rendering or export production.
- The input is scanned or image-only PDF requiring OCR.
- The task is general Markdown transformation after normalization.

---

## Current State

- Status: active.
- Implementation: Python 3.12 package with EPUB3, PDF, attachment metadata,
  tests, examples, and docs.
- Stability: adapter slices are deterministic and test-covered.
- Integration: registered in the Custodian State Hub as `markitect-filter`.
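
As a rough illustration of what "deterministic" means for an adapter slice, the sketch below runs a normalization step several times on the same input and compares content hashes. The `normalize` function here is a hypothetical stand-in, not this package's API; the real adapters and their tests are not shown in this file.

```python
import hashlib


def normalize(source_bytes: bytes) -> str:
    """Hypothetical stand-in for an adapter's normalization step."""
    text = source_bytes.decode("utf-8")
    # Unify line endings and drop trailing whitespace so output is stable.
    lines = [line.rstrip() for line in text.replace("\r\n", "\n").split("\n")]
    return "\n".join(lines).strip() + "\n"


def digest(markdown: str) -> str:
    """Hash the normalized Markdown so two runs can be compared cheaply."""
    return hashlib.sha256(markdown.encode("utf-8")).hexdigest()


def is_deterministic(source_bytes: bytes, runs: int = 3) -> bool:
    """Return True when repeated normalization yields identical output."""
    digests = {digest(normalize(source_bytes)) for _ in range(runs)}
    return len(digests) == 1


sample = b"# Title\r\n\r\nBody text.   \r\n"
print(is_deterministic(sample))
```

A check of this shape would typically sit alongside fixture-based tests, with the stand-in replaced by the adapter under test.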

---

## How It Fits

- Upstream contract: `markitect-tool` owns the source adapter interfaces and
  normalized document model.
- Downstream consumers: Markitect workflows load this package through source
  adapter entry points.
- Adjacent repo: `markitect-quarkdown` consumes Markitect render/export
  contracts on the output side.

---

## Terminology

- Preferred terms: source adapter, normalized Markdown document, attachment
  metadata, EPUB3 package, digitally readable PDF.
- Also known as: Markitect filter adapters.
- Potentially confusing terms: "filter" means source normalization here, not
  policy filtering or search result filtering.

---

## Related / Overlapping

- `markitect-tool` - owns core contracts and normalized document APIs.
- `markitect-quarkdown` - owns concrete rendering/export through Quarkdown.
- `the-custodian` - State Hub registration, workplan tracking, and consistency
  sync.

---

## Getting Oriented

- Start with: `README.md`, `docs/pdf-adapter.md`, and
  `docs/source-attachment-metadata.md`.
- Key directories: `src/markitect_filter/`, `tests/`, `examples/`, and
  `workplans/`.
- Entry points: `markitect_filter.adapters:epub3_adapter_descriptor` and
  `markitect_filter.adapters:pdf_adapter_descriptor`.

---

## Provided Capabilities

```capability
type: source_adapter
title: source.epub3
description: Convert EPUB3 packages into normalized Markitect Markdown documents with package resource attachment metadata.
keywords: [epub3, source-adapter, markdown, attachments]
```

```capability
type: source_adapter
title: source.pdf
description: Convert local digitally readable PDFs into normalized Markitect Markdown documents with embedded-file and image-resource signals.
keywords: [pdf, source-adapter, markdown, attachments]
```
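
The capability blocks above are simple `key: value` records inside fenced blocks, so they can be read mechanically. The sketch below is one hypothetical way to extract them; the parser that actually consumes this file is not described here, and `parse_capabilities` is an illustrative name.

```python
import re

FENCE = "`" * 3  # built indirectly so this example nests cleanly in Markdown

# Match a fenced block whose info string is "capability" and capture its body.
CAPABILITY_BLOCK = re.compile(
    rf"^{FENCE}capability\n(.*?)^{FENCE}", re.DOTALL | re.MULTILINE
)


def parse_capabilities(markdown: str) -> list[dict]:
    """Return one dict per capability block found in the Markdown text."""
    capabilities = []
    for match in CAPABILITY_BLOCK.finditer(markdown):
        fields = {}
        for line in match.group(1).splitlines():
            if ":" not in line:
                continue
            key, _, value = line.partition(":")
            value = value.strip()
            # Treat "[a, b, c]" as a flat list of strings.
            if value.startswith("[") and value.endswith("]"):
                value = [v.strip() for v in value[1:-1].split(",") if v.strip()]
            fields[key.strip()] = value
        capabilities.append(fields)
    return capabilities


sample = "\n".join([
    f"{FENCE}capability",
    "type: source_adapter",
    "title: source.epub3",
    "keywords: [epub3, source-adapter, markdown, attachments]",
    FENCE,
])
print(parse_capabilities(sample))
```

The sketch splits on the first colon only, so a description that itself contains a colon stays intact.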

---

## Notes

Run tests with `PYTHONPATH=src:/home/worsch/markitect-tool/src python3 -m
pytest` from this checkout.