Workplan dependencies and prio for text research lab workplans
docs/query-extraction.md
# Query And Extraction

Date: 2026-05-03
## Purpose

The first query layer keeps selection close to the structured Markdown model.
It is intentionally small and deterministic. JSONPath or another query backend
can be added later behind the same API if the simple selector language becomes
too limited.
## CLI

```text
mkt query <document.md> <selector> [--format json|yaml|text]
mkt extract <document.md> <selector> [--format text|json|yaml]
```

`query` returns structured matches. `extract` returns textual content from the
matches.
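The difference between the two commands can be illustrated with a minimal Python sketch. The match shape (`type`, `level`, `text`) and the helper names `query_output` and `extract_output` are assumptions made for illustration; they are not the actual `mkt` output schema or API.

```python
import json
from typing import Any

# Hypothetical matches, e.g. for a selector such as `headings[level=2]`.
matches: list[dict[str, Any]] = [
    {"type": "heading", "level": 2, "text": "Decision"},
    {"type": "heading", "level": 2, "text": "Context"},
]


def query_output(found: list[dict[str, Any]]) -> str:
    """Structured view: serialize the whole match objects (here as JSON)."""
    return json.dumps(found, indent=2)


def extract_output(found: list[dict[str, Any]]) -> str:
    """Textual view: keep only the text of each match, one per line."""
    return "\n".join(str(m.get("text", "")) for m in found)


if __name__ == "__main__":
    print(query_output(matches))
    print(extract_output(matches))
```

In this sketch the structured output round-trips through `json.loads`, while the extracted output is plain text with the match metadata dropped.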
## Selectors

Supported targets:

- `document`, `$`, or `.`: full parsed document
- `frontmatter`: YAML frontmatter
- `headings`: heading objects
- `sections`: heading-led sections
- `blocks`: parsed content blocks
- `metrics`: document and section metrics

Supported path examples:

```text
frontmatter.status
frontmatter.owner.name
metrics.document.words
metrics.document.sections
```
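Path selectors like the ones above can be read as a walk through nested mappings. The following is a minimal sketch under stated assumptions: the parsed document is modelled as a plain nested `dict`, the function name `resolve_path` is hypothetical, and returning `None` for a missing key is a choice made here, not documented `mkt` behaviour.

```python
from typing import Any


def resolve_path(model: dict[str, Any], selector: str) -> Any:
    """Resolve a dotted selector such as 'frontmatter.owner.name'."""
    # `document`, `$`, and `.` all name the full parsed document.
    if selector in ("document", "$", "."):
        return model
    current: Any = model
    for key in selector.split("."):
        # Assumption: a missing key yields None instead of raising.
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical parsed model, used only for illustration.
model = {
    "frontmatter": {"status": "draft", "owner": {"name": "Ada"}},
    "metrics": {"document": {"words": 120, "sections": 4}},
}

if __name__ == "__main__":
    print(resolve_path(model, "frontmatter.owner.name"))
    print(resolve_path(model, "metrics.document.words"))
```

Because each step is a dictionary lookup, the result depends only on the selector and the parsed model, which fits the deterministic design described in the Purpose section.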
Supported filters:

```text
headings[level=2]
headings[text=Decision]
headings[text~=decision]
sections[heading=Context]
sections[heading~=risk]
sections[contains=problem]
sections[contains~=PROBLEM]
blocks[type=paragraph]
blocks[contains~=follow-up]
```

`=` is exact and case-sensitive. `~=` is substring matching and
case-insensitive.
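The two operators can be sketched in Python as follows. This is an illustration, not the `mkt` implementation: the regular expression, the function names, and the sample heading objects are assumptions. The sketch treats the filter key as a plain field on each item, so it does not model `contains`, which presumably searches body text.

```python
import re
from typing import Any

# target[key=value] or target[key~=value]; `~=` is tried before `=`.
FILTER_RE = re.compile(r"^(?P<target>\w+)\[(?P<key>\w+)(?P<op>~=|=)(?P<value>.*)\]$")


def parse_filter(selector: str) -> tuple[str, str, str, str]:
    """Split a filter selector into (target, key, operator, value)."""
    m = FILTER_RE.match(selector)
    if m is None:
        raise ValueError(f"not a filter selector: {selector!r}")
    return m["target"], m["key"], m["op"], m["value"]


def value_matches(actual: Any, op: str, expected: str) -> bool:
    """`=` is exact and case-sensitive; `~=` is a case-insensitive substring test."""
    text = str(actual)
    if op == "=":
        return text == expected
    return expected.lower() in text.lower()


def apply_filter(model: dict[str, Any], selector: str) -> list[dict[str, Any]]:
    """Return the items of the target list whose field satisfies the filter."""
    target, key, op, value = parse_filter(selector)
    return [
        item
        for item in model.get(target, [])
        if key in item and value_matches(item[key], op, value)
    ]


# Hypothetical parsed headings, used only for illustration.
model = {
    "headings": [
        {"level": 1, "text": "Query And Extraction"},
        {"level": 2, "text": "Decision"},
        {"level": 2, "text": "Decision log"},
    ]
}
```

With this sample data, `headings[text=Decision]` selects only the heading whose text is exactly `Decision`, while `headings[text~=decision]` also selects `Decision log`.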
## Current Boundary

This is not a full query language. It covers practical extraction from the
current parser model:

- frontmatter values
- headings
- sections
- content blocks
- metrics

Future query backend work should preserve this simple surface and add optional
adapters rather than forcing every user into a heavier language.

Advanced query and cache backends are tracked in:

- `docs/cache-backend-architecture-blueprint.md`
- `workplans/MKTT-WP-0007-advanced-query-and-local-index-backend.md`