# Query And Extraction
Date: 2026-05-03
## Purpose
The first query layer keeps selection close to the structured Markdown model.
It is intentionally small and deterministic. JSONPath or another query backend
can be added later behind the same API if the simple selector language becomes
too limited.
## CLI
```text
mkt query <document.md> <selector> [--format json|yaml|text]
mkt extract <document.md> <selector> [--format text|json|yaml]
```
`query` returns structured matches. `extract` returns textual content from the
matches.
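
To make the command shape concrete, here is a minimal Python sketch of how the two subcommands could be declared with `argparse`. It is not the tool's actual implementation. `build_parser` is a hypothetical name, and the default format for each command (the first one listed in its usage line) is an assumption.

```python
import argparse

FORMATS = ("json", "yaml", "text")


def build_parser() -> argparse.ArgumentParser:
    """Declare `query` and `extract` with the arguments shown in the usage lines."""
    parser = argparse.ArgumentParser(prog="mkt")
    sub = parser.add_subparsers(dest="command", required=True)
    # Assumption: each command defaults to the first format in its usage line.
    for name, default in (("query", "json"), ("extract", "text")):
        cmd = sub.add_parser(name)
        cmd.add_argument("document")
        cmd.add_argument("selector")
        cmd.add_argument("--format", choices=FORMATS, default=default)
    return parser


# "notes.md" is a placeholder document name.
args = build_parser().parse_args(["query", "notes.md", "headings[level=2]"])
print(args.command, args.document, args.selector, args.format)
```

Both commands take the same positional arguments in this sketch. Only the assumed default output format differs.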
## Selectors
Supported targets:
- `document`, `$`, or `.`: full parsed document
- `frontmatter`: YAML frontmatter
- `headings`: heading objects
- `sections`: heading-led sections
- `blocks`: parsed content blocks
- `metrics`: document and section metrics
Supported path examples:
```text
frontmatter.status
frontmatter.owner.name
metrics.document.words
metrics.document.sections
```
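
A dotted path like the ones above can be resolved by walking nested mappings one key at a time. The following is a minimal sketch, assuming the parsed document is available as nested dictionaries. `resolve_path`, the sample `model`, and the choice to return `None` for a missing key are all illustrative assumptions, not the tool's confirmed behavior.

```python
from typing import Any


def resolve_path(model: dict[str, Any], selector: str) -> Any:
    """Follow a dotted selector such as 'frontmatter.owner.name' through nested dicts."""
    current: Any = model
    for key in selector.split("."):
        if not isinstance(current, dict) or key not in current:
            # Assumption: a missing key yields no match rather than an error.
            return None
        current = current[key]
    return current


# Hypothetical parsed model; the real parser's output shape may differ.
model = {
    "frontmatter": {"status": "draft", "owner": {"name": "Ada"}},
    "metrics": {"document": {"words": 120, "sections": 4}},
}

print(resolve_path(model, "frontmatter.owner.name"))  # Ada
print(resolve_path(model, "metrics.document.words"))  # 120
print(resolve_path(model, "frontmatter.missing"))     # None
```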
Supported filters:
```text
headings[level=2]
headings[text=Decision]
headings[text~=decision]
sections[heading=Context]
sections[heading~=risk]
sections[contains=problem]
sections[contains~=PROBLEM]
blocks[type=paragraph]
blocks[contains~=follow-up]
```
`=` is an exact, case-sensitive match. `~=` is a case-insensitive substring
match.
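
The two operators can be illustrated with a short Python sketch. The grammar here is inferred from the filter examples and may not match the real parser. `parse_filter`, `matches`, and `select` are hypothetical names, and the sketch only handles filters on plain fields such as `level` and `text`, not `contains`.

```python
import re
from typing import Any

# target[key=value] or target[key~=value]; grammar inferred from the examples.
_FILTER = re.compile(r"^(?P<target>\w+)\[(?P<key>\w+)(?P<op>~=|=)(?P<value>.*)\]$")


def parse_filter(selector: str) -> tuple[str, str, str, str]:
    """Split a filter selector into (target, key, operator, value)."""
    m = _FILTER.match(selector)
    if m is None:
        raise ValueError(f"not a filter selector: {selector!r}")
    return m["target"], m["key"], m["op"], m["value"]


def matches(actual: Any, op: str, expected: str) -> bool:
    """`=` compares exactly; `~=` checks for a case-insensitive substring."""
    text = str(actual)
    if op == "=":
        return text == expected
    return expected.lower() in text.lower()


def select(items: list[dict[str, Any]], selector: str) -> list[dict[str, Any]]:
    """Keep the items whose field satisfies the filter."""
    _, key, op, value = parse_filter(selector)
    return [item for item in items if key in item and matches(item[key], op, value)]


# Hypothetical heading objects; the real model may carry more fields.
headings = [
    {"level": 1, "text": "Query And Extraction"},
    {"level": 2, "text": "Decision"},
    {"level": 2, "text": "Decision log"},
]

print(select(headings, "headings[text=Decision]"))   # only the exact "Decision" heading
print(select(headings, "headings[text~=decision]"))  # both headings containing "decision"
```

With these sample headings, `headings[text=decision]` matches nothing because `=` is case-sensitive, while `headings[text~=decision]` matches both "Decision" and "Decision log".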
## Current Boundary
This is not a full query language. It covers practical extraction from the
current parser model:
- frontmatter values
- headings
- sections
- content blocks
- metrics
Future query backend work should preserve this simple surface and add optional
adapters rather than forcing every user into a heavier language.
Advanced query and cache backends are tracked in:
- `docs/cache-backend-architecture-blueprint.md`
- `workplans/MKTT-WP-0007-advanced-query-and-local-index-backend.md`