| id | type | title | domain | status | owner | topic_slug | planning_priority | planning_order | depends_on_workplans | related_workplans | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MKTT-WP-0019 | workplan | Source Adapter Contract Refinement | markitect | done | markitect-tool | markitect | complete | 142 | | | 2026-05-14 | 2026-05-14 | 10a85934-a4b2-4661-83f7-92ac8d322af4 |
MKTT-WP-0019: Source Adapter Contract Refinement
Purpose
Refine the source adapter contract before implementing
MKTT-WP-0018. The goal is to remove the remaining ambiguity in the external
adapter surface so markitect-tool can implement the framework and
markitect-filter can implement EPUB3 without guessing about model fields,
entry points, CLI behavior, or contract-test expectations.
This is a short gating workplan. It should produce decisions, documentation,
and test fixtures that make MKTT-WP-0018 implementation straightforward.
Background
MKTT-WP-0018 establishes the correct architecture boundary:
markitect-tool -> contracts, normalized markdown model, registry, CLI/API
markitect-filter -> concrete source-format adapters, EPUB3 first
The boundary is sound, but a feasibility review found that the implementation workplan still leaves several decisions too implicit:
- the existing internal extension framework does not yet define external package entry point discovery
- the normalized source-to-markdown model names are listed, but field-level contracts and serialization rules are not pinned
- v1 should be read-only, with write/export support reserved for a later format-by-format decision
- CLI/API output envelopes, adapter selection, and unsupported-format behavior need deterministic contracts
- markitect-filter needs a concrete handoff shape for its first EPUB3 adapter
Decision
Add a refinement pass ahead of MKTT-WP-0018. This workplan should define the
minimum stable v1 contract and explicitly defer nonessential scope.
The v1 source adapter contract should be:
- read-only
- deterministic
- local-file-first, with URI support documented as future or explicitly scoped
- discoverable through a named package entry point group
- serializable without heavyweight optional format dependencies
- testable through fake adapters and small fixtures
Non-Goals
- Do not implement EPUB3 parsing here.
- Do not implement the full markitect-tool source adapter framework here.
- Do not add PDF, DOCX, ODT, OCR, or browser dependencies.
- Do not design write/export adapters beyond recording the future extension point.
- Do not make markitect-filter a knowledge platform or ingestion service.
P19.1 - Pin v1 scope and external adapter package shape
id: MKTT-WP-0019-T001
status: done
priority: high
state_hub_task_id: "7ecc6976-c549-47ba-9a16-4d55d1173b41"
Define the v1 source adapter scope:
- read adapters only
- local filesystem inputs first
- explicit future status for URI inputs, binary attachments, and write adapters
- expected external package layout for markitect-filter
- dependency policy for optional format libraries
- compatibility expectations between markitect-tool and adapter packages
Output: concise architecture note or source-adapter contract section that
MKTT-WP-0018 can implement directly.
Implemented: docs/source-adapter-contract.md defines the v1 read-only scope,
local-file-first posture, external package shape, optional dependency policy,
and compatibility boundary for markitect-filter.
P19.2 - Specify normalized data model fields and serialization
id: MKTT-WP-0019-T002
status: done
priority: high
state_hub_task_id: "7b164d67-8374-4aea-9948-f54912ef4cf5"
Specify the field-level v1 model for:
- SourceAsset
- SourceMetadata
- NormalizedMarkdownDocument
- NormalizedMarkdownSegment
- SourceProvenance
- NormalizationQuality
- adapter diagnostics using the existing Diagnostic/SourceLocation shape
- optional asset reference envelopes, if needed for v1
The specification should define required vs optional fields, stable dict/JSON serialization, digest/cache-key inputs, segment ordering, segment IDs, headings, anchors, source hrefs, page/section references, and adapter metadata.
Output: model contract documentation and fixture-shaped examples.
Implemented: docs/source-adapter-contract.md pins field-level model contracts
for source assets, metadata, provenance, segments, normalized documents, and
quality. examples/source-adapters/normalized-document.json and
examples/source-adapters/normalized-output.md provide fixture-shaped
examples.
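As a rough illustration of what stable dict/JSON serialization and digest inputs could look like, the sketch below models two of the named types as frozen dataclasses. The class names come from this workplan; every field name, the segment ID format, and the choice of SHA-256 are assumptions made for the example, not the pinned contract in docs/source-adapter-contract.md.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class NormalizedMarkdownSegment:
    # Field names are illustrative assumptions, not the pinned contract.
    segment_id: str
    order: int
    markdown: str
    heading: Optional[str] = None      # optional field
    source_href: Optional[str] = None  # optional field


@dataclass(frozen=True)
class NormalizedMarkdownDocument:
    adapter_id: str
    source_digest: str
    segments: Tuple[NormalizedMarkdownSegment, ...] = ()

    def to_dict(self) -> dict:
        data = asdict(self)
        # Emit segments in a stable order regardless of construction order.
        data["segments"] = sorted(data["segments"], key=lambda s: s["order"])
        return data

    def to_json(self) -> str:
        # Sorted keys and fixed separators keep the byte output stable.
        return json.dumps(self.to_dict(), sort_keys=True, separators=(",", ":"))

    def cache_key(self) -> str:
        # Hypothetical cache key: SHA-256 over the stable JSON form.
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()

    @classmethod
    def from_dict(cls, data: dict) -> "NormalizedMarkdownDocument":
        segments = tuple(NormalizedMarkdownSegment(**s) for s in data["segments"])
        return cls(data["adapter_id"], data["source_digest"], segments)


intro = NormalizedMarkdownSegment("seg-0001", 0, "# Intro", heading="Intro")
body = NormalizedMarkdownSegment("seg-0002", 1, "Body text.")
doc = NormalizedMarkdownDocument("fake.text", "sha256:abc", (body, intro))
restored = NormalizedMarkdownDocument.from_dict(json.loads(doc.to_json()))
```

Sorting segments at serialization time means two documents built in different orders share one cache key, which is the property a round-trip fixture test would want to assert.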
P19.3 - Specify read adapter protocol and selection semantics
id: MKTT-WP-0019-T003
status: done
priority: high
state_hub_task_id: "f7cc1956-a6f3-4181-b4df-786cbba39198"
Define the v1 read protocol:
- request/result type names and fields
- can_read, inspect, and read method signatures
- media type and file extension matching rules
- adapter option schema conventions
- malformed-source and unsupported-format diagnostics
- deterministic adapter selection when multiple adapters match
- behavior when optional adapter dependencies are missing
Output: protocol contract that can be implemented as Python Protocol
classes in MKTT-WP-0018.
Implemented: docs/source-adapter-contract.md defines the v1
SourceReadAdapter protocol, request/result names, option handling, adapter
selection semantics, and deterministic diagnostics for unsupported, malformed,
and dependency-missing inputs.
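The protocol shape and deterministic selection can be sketched as follows. `SourceReadAdapter`, `can_read`, `inspect`, and `read` are named in this workplan; the `SourceReadRequest` fields, the `priority` attribute, the dict return types, and the tie-break rule (highest priority, then lowest adapter ID) are assumptions for illustration only.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Optional, Protocol, Sequence


@dataclass(frozen=True)
class SourceReadRequest:  # hypothetical request shape
    path: PurePath
    adapter_id: Optional[str] = None  # explicit adapter override, if any


class SourceReadAdapter(Protocol):
    adapter_id: str
    priority: int  # hypothetical tie-break input

    def can_read(self, request: SourceReadRequest) -> bool: ...
    def inspect(self, request: SourceReadRequest) -> dict: ...
    def read(self, request: SourceReadRequest) -> dict: ...


class FakeAdapter:
    """In-memory adapter that matches on file extension only."""

    def __init__(self, adapter_id: str, priority: int, extensions: Sequence[str]):
        self.adapter_id = adapter_id
        self.priority = priority
        self.extensions = tuple(e.lower() for e in extensions)

    def can_read(self, request: SourceReadRequest) -> bool:
        return request.path.suffix.lower() in self.extensions

    def inspect(self, request: SourceReadRequest) -> dict:
        return {"adapter_id": self.adapter_id, "path": str(request.path)}

    def read(self, request: SourceReadRequest) -> dict:
        return {"adapter_id": self.adapter_id, "markdown": ""}


def select_adapter(adapters, request: SourceReadRequest) -> Optional[SourceReadAdapter]:
    """Pick one adapter; the result does not depend on registration order."""
    if request.adapter_id is not None:
        candidates = [a for a in adapters if a.adapter_id == request.adapter_id]
    else:
        candidates = [a for a in adapters if a.can_read(request)]
    # Assumed rule: highest priority first, then adapter ID as a stable tie-break.
    candidates.sort(key=lambda a: (-a.priority, a.adapter_id))
    return candidates[0] if candidates else None


alpha = FakeAdapter("fake.alpha", 10, [".txt"])
beta = FakeAdapter("fake.beta", 10, [".txt", ".md"])
request = SourceReadRequest(PurePath("notes.TXT"))
chosen = select_adapter([beta, alpha], request)
```

Returning `None` here stands in for the unsupported-format case; a real implementation would presumably attach a diagnostic rather than return a bare sentinel.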
P19.4 - Define package entry point and registry contract
id: MKTT-WP-0019-T004
status: done
priority: high
state_hub_task_id: "5db7448c-c0d0-48eb-8e44-9f694782af7f"
Define how external source adapter packages register with markitect-tool:
- entry point group name, initially markitect_tool.source_adapters
- expected entry point object shape
- descriptor ID and versioning rules
- relationship between source adapter descriptors and ExtensionDescriptor
- duplicate descriptor handling
- dependency diagnostics for missing optional format libraries
- compatibility notes for separately versioned packages
Output: discovery contract and fake entry point test plan for
MKTT-WP-0018.
Implemented: docs/source-adapter-contract.md defines the
markitect_tool.source_adapters entry point group, accepted entry point object
shapes, descriptor mapping to ExtensionDescriptor, duplicate handling, and
dependency diagnostics. examples/source-adapters/fake-adapter-pyproject.toml
provides the fake entry point fixture.
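A minimal sketch of discovery against the `markitect_tool.source_adapters` group, using the standard library's `importlib.metadata`. The group name is taken from this workplan. The dict-shaped descriptor, the two accepted object shapes (a descriptor or a zero-argument factory), the diagnostic codes, and the first-name-wins duplicate rule are assumptions, and `FakeEntryPoint` is a hypothetical test stand-in.

```python
from importlib.metadata import entry_points

ENTRY_POINT_GROUP = "markitect_tool.source_adapters"


class FakeEntryPoint:
    """Stand-in for importlib.metadata.EntryPoint, for tests only."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader

    def load(self):
        return self._loader()


def load_descriptors(eps):
    """Load entry points in name order, collecting diagnostics instead of raising."""
    descriptors, diagnostics = {}, []
    for ep in sorted(eps, key=lambda e: e.name):
        try:
            obj = ep.load()
        except ImportError as exc:
            # A missing optional format library surfaces as a diagnostic.
            diagnostics.append(
                {"code": "dependency-missing", "entry_point": ep.name, "message": str(exc)}
            )
            continue
        # Assumed shapes: a descriptor object or a zero-argument factory.
        descriptor = obj() if callable(obj) else obj
        descriptor_id = descriptor["id"]
        if descriptor_id in descriptors:
            diagnostics.append(
                {
                    "code": "duplicate-descriptor",
                    "entry_point": ep.name,
                    "message": f"descriptor {descriptor_id!r} is already registered",
                }
            )
            continue
        descriptors[descriptor_id] = descriptor
    return descriptors, diagnostics


def discover():
    # entry_points(group=...) requires Python 3.10 or newer.
    return load_descriptors(entry_points(group=ENTRY_POINT_GROUP))


def _missing_dependency():
    raise ImportError("No module named 'hypothetical_epub_lib'")


fakes = [
    FakeEntryPoint("b_dup", lambda: {"id": "fake.text", "version": "2"}),
    FakeEntryPoint("a_text", lambda: {"id": "fake.text", "version": "1"}),
    FakeEntryPoint("c_factory", lambda: (lambda: {"id": "fake.factory", "version": "1"})),
    FakeEntryPoint("d_epub", _missing_dependency),
]
descriptors, diagnostics = load_descriptors(fakes)
```

Keeping `load_descriptors` separate from `discover` lets tests feed in fake entry points without installing a package or monkeypatching the metadata lookup.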
P19.5 - Pin CLI/API output envelopes and exit behavior
id: MKTT-WP-0019-T005
status: done
priority: medium
state_hub_task_id: "b57a2fd1-e528-4481-b11b-12b15979a85f"
Specify the public source commands and library functions:
- mkt source adapters
- mkt source inspect <path>
- mkt source normalize <path> --format markdown
- JSON output for adapters, inspection, normalization, and diagnostics
- Markdown output for normalized document content
- adapter selection and explicit adapter override options
- exit behavior for unsupported, malformed, or dependency-missing inputs
- public API names that should be exported from markitect_tool
Output: CLI/API contract note and expected-output fixtures.
Implemented: docs/source-adapter-contract.md pins the mkt source command
surface, formats, options, exit behavior, and public API export names.
examples/source-adapters/adapter-list.json and
examples/source-adapters/inspect-result.json provide expected-output
fixtures.
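To make the idea of a deterministic output envelope and exit behavior concrete, here is a small sketch. Nothing in it reflects the pinned values: the envelope keys, the diagnostic codes, and the exit statuses are all hypothetical placeholders, and the authoritative definitions are the ones in docs/source-adapter-contract.md and the fixtures above.

```python
import json

# Hypothetical diagnostic codes and exit statuses, chosen only for this sketch.
EXIT_OK = 0
EXIT_BY_CODE = {
    "unsupported-format": 2,
    "malformed-source": 3,
    "dependency-missing": 4,
}


def exit_code_for(diagnostics):
    """Return 0 when there are no error diagnostics, else the lowest mapped status."""
    statuses = [
        EXIT_BY_CODE.get(d["code"], 1)
        for d in diagnostics
        if d.get("severity") == "error"
    ]
    return min(statuses) if statuses else EXIT_OK


def render_envelope(command, result, diagnostics):
    """Serialize a command result with sorted keys and sorted diagnostics."""
    envelope = {
        "command": command,
        "ok": exit_code_for(diagnostics) == EXIT_OK,
        "result": result,
        "diagnostics": sorted(
            diagnostics, key=lambda d: (d["code"], d.get("message", ""))
        ),
    }
    return json.dumps(envelope, sort_keys=True, indent=2)


unsupported = [
    {
        "code": "unsupported-format",
        "severity": "error",
        "message": "no adapter matches book.xyz",
    }
]
payload = render_envelope("source inspect", None, unsupported)
```

Deriving both the `ok` flag and the process exit status from the same diagnostics list keeps the JSON output and the exit behavior from drifting apart.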
P19.6 - Prepare contract-test and markitect-filter handoff criteria
id: MKTT-WP-0019-T006
status: done
priority: high
state_hub_task_id: "a7cb10fd-e1bd-4aee-81af-c93f09496ff8"
Define the contract tests that MKTT-WP-0018 must implement:
- fake in-tree adapter for core behavior
- fake external adapter package or monkeypatched entry point for discovery
- serialization round trips for normalized model fixtures
- unsupported-format and missing-dependency diagnostics
- CLI JSON and Markdown output fixtures
- reusable adapter conformance expectations for markitect-filter
Also seed the markitect-filter handoff:
- expected package entry point declaration
- first EPUB3 adapter descriptor shape
- minimal fixture expectations for EPUB3 spine/nav/body extraction
- follow-up workplan seed for markitect-filter implementation
Output: contract-test checklist and handoff note.
Implemented: docs/source-adapter-contract.md includes the WP0018 contract
test checklist and the first markitect-filter EPUB3 handoff descriptor,
fixture expectations, and extraction responsibilities.
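One way the reusable conformance expectations could be packaged is as a single checker that both the in-tree fake adapter and a markitect-filter adapter run against. The sketch below is an assumption about that shape: `check_conformance` and `FakeInTreeAdapter` are hypothetical names, and a plain path string stands in for whatever request type the contract defines.

```python
import json
from typing import List


class FakeInTreeAdapter:
    """Fake adapter that never touches the filesystem."""

    adapter_id = "fake.text"

    def can_read(self, request: str) -> bool:
        return request.lower().endswith(".txt")

    def inspect(self, request: str) -> dict:
        return {"adapter_id": self.adapter_id, "path": request}

    def read(self, request: str) -> dict:
        return {
            "adapter_id": self.adapter_id,
            "segments": [{"id": "seg-0001", "markdown": "fixture"}],
        }


def check_conformance(adapter, supported: str, unsupported: str) -> List[str]:
    """Return a list of failure messages; an empty list means the adapter passed."""
    failures = []
    for name in ("can_read", "inspect", "read"):
        if not callable(getattr(adapter, name, None)):
            failures.append(f"missing method: {name}")
    if failures:
        return failures
    if not adapter.can_read(supported):
        failures.append("can_read rejected a supported input")
    if adapter.can_read(unsupported):
        failures.append("can_read accepted an unsupported input")
    first, second = adapter.read(supported), adapter.read(supported)
    try:
        # Two reads of the same input must serialize to identical JSON.
        if json.dumps(first, sort_keys=True) != json.dumps(second, sort_keys=True):
            failures.append("read output is not deterministic")
    except TypeError:
        failures.append("read output is not JSON-serializable")
    return failures


report = check_conformance(FakeInTreeAdapter(), "notes.txt", "book.epub")
```

Returning messages rather than raising makes the checker usable from any test runner and lets one run report every violation at once.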
Acceptance
- MKTT-WP-0018 has no unresolved v1 contract ambiguity around model fields, read protocol shape, entry point discovery, CLI/API output, or fake adapter tests.
- v1 is explicitly read-only; write/export support is deferred to a later workplan.
- External adapter discovery has a named entry point group and descriptor object contract.
- markitect-filter has enough handoff detail to implement EPUB3 without importing implementation decisions from infospace-bench.
- The existing MKTT-WP-0018 workplan is updated to depend on this refinement pass and to reference the pinned decisions rather than reopening them.