markitect-tool/docs/query-extraction.md

# Query And Extraction

Date: 2026-05-03

## Purpose

The first query layer keeps selection close to the structured Markdown model.
It is intentionally small and deterministic. JSONPath or another query backend
can be added later behind the same API if the simple selector language becomes
too limited.

## CLI

```text
mkt query <document.md> <selector> [--format json|yaml|text]
mkt extract <document.md> <selector> [--format text|json|yaml]
```

`query` returns structured matches. `extract` returns textual content from the
matches.
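The difference between the two commands can be sketched in Python. This is only an illustration: the match objects, their field names (`type`, `heading`, `text`), and the helper functions are hypothetical and are not taken from the tool's actual implementation.

```python
import json

# Hypothetical match objects; the real field names in the parser model may differ.
MATCHES = [
    {"type": "section", "heading": "Context", "text": "We need a decision on caching."},
    {"type": "section", "heading": "Risks", "text": "Index staleness is the main risk."},
]


def query_output(matches):
    """Structured output, roughly what `query --format json` might print."""
    return json.dumps(matches, indent=2)


def extract_output(matches):
    """Textual output: only the text of each match, separated by blank lines."""
    return "\n\n".join(match["text"] for match in matches)


if __name__ == "__main__":
    print(query_output(MATCHES))
    print(extract_output(MATCHES))
```

The point of the sketch is that `query` preserves the shape of each match, while `extract` discards it and keeps only the content.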

## Selectors

Supported targets:

- `document`, `$`, or `.`: full parsed document
- `frontmatter`: YAML frontmatter
- `headings`: heading objects
- `sections`: heading-led sections
- `blocks`: parsed content blocks
- `metrics`: document and section metrics

Examples of supported paths:

```text
frontmatter.status
frontmatter.owner.name
metrics.document.words
metrics.document.sections
```
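A dotted path like the ones above can be resolved by walking nested mappings one key at a time. The following is a minimal sketch, assuming the parsed document is available as nested dictionaries; `DOCUMENT` and `resolve_path` are hypothetical names and the sample values are made up for illustration.

```python
from typing import Any

# Hypothetical stand-in for the parsed document model; the real structure may differ.
DOCUMENT = {
    "frontmatter": {"status": "accepted", "owner": {"name": "Ada"}},
    "metrics": {"document": {"words": 412, "sections": 5}},
}


def resolve_path(model: dict[str, Any], selector: str) -> Any:
    """Walk a dotted selector through nested dicts; return None if a key is missing."""
    current: Any = model
    for key in selector.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


if __name__ == "__main__":
    print(resolve_path(DOCUMENT, "frontmatter.owner.name"))
    print(resolve_path(DOCUMENT, "metrics.document.words"))
```

Returning `None` for a missing key is a choice made for this sketch; the tool may report missing paths differently.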

Supported filters:

```text
headings[level=2]
headings[text=Decision]
headings[text~=decision]
sections[heading=Context]
sections[heading~=risk]
sections[contains=problem]
sections[contains~=PROBLEM]
blocks[type=paragraph]
blocks[contains~=follow-up]
```

`=` matches the whole value exactly and is case-sensitive. `~=` matches any
substring of the value and is case-insensitive.
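The two operators can be sketched as follows. This is an illustrative approximation, not the tool's parser: the regular expression, the `Filter` tuple, and the function names are assumptions, and the sketch compares every value as a string (so `level=2` matches a numeric level of `2`), which the real implementation may or may not do.

```python
import re
from typing import NamedTuple, Optional


class Filter(NamedTuple):
    target: str
    key: Optional[str]
    op: Optional[str]
    value: Optional[str]


# Hypothetical grammar: target, optionally followed by [key=value] or [key~=value].
_SELECTOR = re.compile(
    r"^(?P<target>\w+)(?:\[(?P<key>\w+)(?P<op>~=|=)(?P<value>[^\]]*)\])?$"
)


def parse_selector(selector: str) -> Filter:
    """Split a selector such as `headings[text~=decision]` into its parts."""
    match = _SELECTOR.match(selector)
    if match is None:
        raise ValueError(f"unsupported selector: {selector!r}")
    return Filter(match["target"], match["key"], match["op"], match["value"])


def matches(actual: object, op: str, expected: str) -> bool:
    """Apply `=` (exact, case-sensitive) or `~=` (substring, case-insensitive)."""
    text = str(actual)  # assumption: values are compared as strings
    if op == "=":
        return text == expected
    if op == "~=":
        return expected.lower() in text.lower()
    raise ValueError(f"unknown operator: {op!r}")


if __name__ == "__main__":
    parsed = parse_selector("headings[text~=decision]")
    print(parsed)
    print(matches("Key Decision", parsed.op, parsed.value))
```

Under these assumptions, `headings[text=Decision]` would not match a heading titled "decision", while `headings[text~=decision]` would match "Key Decision".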

## Current Boundary

This is not a full query language. It covers practical extraction from the
current parser model:

- frontmatter values
- headings
- sections
- content blocks
- metrics

Future query backend work should preserve this simple surface and add optional
adapters rather than forcing every user into a heavier language.
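One way to keep the simple surface while allowing optional adapters is to put backends behind a small interface. The sketch below is hypothetical throughout: `QueryBackend`, `SimpleBackend`, and `run_query` are illustrative names, not part of the current API, and the simple backend handles only bare top-level targets.

```python
from typing import Any, Optional, Protocol


class QueryBackend(Protocol):
    """Hypothetical interface that every query backend would implement."""

    def select(self, model: dict[str, Any], selector: str) -> list[Any]: ...


class SimpleBackend:
    """Stand-in for the built-in selector language; handles bare targets only."""

    def select(self, model: dict[str, Any], selector: str) -> list[Any]:
        value = model.get(selector)
        if value is None:
            return []
        return value if isinstance(value, list) else [value]


def run_query(
    model: dict[str, Any],
    selector: str,
    backend: Optional[QueryBackend] = None,
) -> list[Any]:
    """Use the simple backend unless the caller opts into another adapter."""
    active = backend if backend is not None else SimpleBackend()
    return active.select(model, selector)


if __name__ == "__main__":
    model = {"headings": [{"level": 1, "text": "Title"}]}
    print(run_query(model, "headings"))
```

With this shape, a heavier backend such as a JSONPath adapter would be passed in explicitly, and callers who never pass one keep the default behavior.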

Advanced query and cache backends are tracked in:

- `docs/cache-backend-architecture-blueprint.md`
- `workplans/MKTT-WP-0007-advanced-query-and-local-index-backend.md`