| id | type | title | domain | status | owner | topic_slug | planning_priority | planning_order | depends_on_workplans | related_workplans | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MKTT-WP-0007 | workplan | Advanced Query and Local Index Backend | markitect | done | markitect-tool | markitect | P2 | 60 | | | 2026-05-03 | 2026-05-04 | d61a82e4-651a-4df2-944a-9ff996b2e1f6 |
MKTT-WP-0007: Advanced Query and Local Index Backend
Purpose
Implement the first practical backend use case: cached AST introspection, JSONPath querying, SQLite metadata, and FTS5 search over Markdown documents.
This backend should later be able to index MKTT-WP-0010 references, named
regions, chunks, and processor provenance without changing its basic storage
contract.
Preliminary Refinement - Snapshot Refresh Planning
Implemented before starting the SQLite/index tasks: SnapshotState,
SnapshotPlanEntry, SnapshotRefreshPlan, plan_snapshot_refresh,
load_snapshot_state_file, and CLI mkt backend refresh-plan.
This is the performance contract for WP-0007:
- compare cheap metadata before hashing
- hash only likely-changed files when --verify-hashes is requested
- parse only files whose identity/content requires a new snapshot
- index only new, changed, unindexed, or dependency-invalidated entries
- carry direct and transitive dependency invalidation forward from DependencyEdge
- keep refresh planning inspectable through JSON/YAML/text output
The future SQLite store should persist enough state to feed this planner directly and should report actual refresh work against the same categories.
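The ordering in the contract above (cheap metadata first, hashing only when needed) can be sketched as a small classifier. This is a minimal illustration, not the project's planner: FileMeta, classify, and the category names are hypothetical stand-ins, and the real SnapshotState fields may differ. It follows one reading of the contract in which metadata-changed files are hashed only when hash verification is requested.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical stand-in for the cheap fields of a stored snapshot state."""
    size: int
    mtime_ns: int
    content_hash: Optional[str] = None
    indexed: bool = True


def classify(
    path: str,
    stored: Optional[FileMeta],
    current: FileMeta,
    hash_file: Callable[[str], str],
    verify_hashes: bool = False,
) -> str:
    """Return 'new', 'unindexed', 'unchanged', 'metadata_only', or 'changed'."""
    if stored is None:
        return "new"
    same_metadata = (stored.size, stored.mtime_ns) == (current.size, current.mtime_ns)
    if same_metadata:
        # Cheap check passed: never hash, never parse.
        return "unchanged" if stored.indexed else "unindexed"
    if not verify_hashes:
        # Metadata differs and no verification requested: assume content changed.
        return "changed"
    # Hash only this likely-changed file to see whether content really differs.
    if hash_file(path) == stored.content_hash:
        return "metadata_only"
    return "changed"
```

A caller would pass the stored state from the cache, the result of a stat call as the current state, and a hashing callback, so that the expensive step only runs on the branch that needs it.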
P7.1 - Implement local snapshot store
id: MKTT-WP-0007-T001
status: done
priority: high
state_hub_task_id: "8894a9a4-586c-457b-b4e6-add8276ff5f2"
Persist parsed document snapshots and source metadata in a local cache directory.
Implemented: LocalSnapshotStore, SQLite schema initialization, source-state
loading, parsed document JSON persistence, provenance envelope storage, and
relative path handling. See docs/local-index-backend.md.
Implementation hints:
- Persist SnapshotState fields in the snapshot/source tables.
- Store path, size, mtime, content hash, parser id/version, parse options hash, contract hash, snapshot id, indexed flag, and dependency edges.
- Keep large document/token JSON lazy-loadable so refresh planning does not pull whole AST payloads into memory.
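The hints above suggest keeping the fields the planner reads apart from the large JSON payloads. A minimal sketch of that split, using Python's sqlite3 module, could look like the following. The table names, column names, and helper functions are assumptions made for illustration and are not taken from LocalSnapshotStore; dependency edges are left out to keep the example short.

```python
import json
import sqlite3

# Hypothetical schema: a narrow table for refresh decisions, a separate one for payloads.
SCHEMA = """
CREATE TABLE IF NOT EXISTS source (
    path TEXT PRIMARY KEY,
    size INTEGER NOT NULL,
    mtime_ns INTEGER NOT NULL,
    content_hash TEXT NOT NULL,
    parser_id TEXT NOT NULL,
    parser_version TEXT NOT NULL,
    parse_options_hash TEXT,
    contract_hash TEXT,
    snapshot_id TEXT NOT NULL,
    indexed INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS snapshot_payload (
    snapshot_id TEXT PRIMARY KEY,
    document_json TEXT NOT NULL
);
"""


def open_store(db_path=":memory:"):
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def save_snapshot(conn, state, document):
    """Write the narrow state row and the JSON payload in one transaction."""
    with conn:
        conn.execute(
            """INSERT OR REPLACE INTO source
               (path, size, mtime_ns, content_hash, parser_id, parser_version,
                parse_options_hash, contract_hash, snapshot_id, indexed)
               VALUES (:path, :size, :mtime_ns, :content_hash, :parser_id,
                       :parser_version, :parse_options_hash, :contract_hash,
                       :snapshot_id, :indexed)""",
            state,
        )
        conn.execute(
            "INSERT OR REPLACE INTO snapshot_payload (snapshot_id, document_json) VALUES (?, ?)",
            (state["snapshot_id"], json.dumps(document)),
        )


def load_states(conn):
    """Planner input: touches only the narrow table, never the JSON payloads."""
    cur = conn.execute(
        "SELECT path, size, mtime_ns, content_hash, snapshot_id, indexed FROM source"
    )
    return {row[0]: row[1:] for row in cur}


def load_document(conn, snapshot_id):
    """Load one payload on demand; returns None for an unknown snapshot id."""
    row = conn.execute(
        "SELECT document_json FROM snapshot_payload WHERE snapshot_id = ?",
        (snapshot_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

With this layout, refresh planning costs one scan of small rows, and a document is deserialized only when a query actually needs its AST.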
P7.2 - Add AST introspection commands
id: MKTT-WP-0007-T002
status: done
priority: high
state_hub_task_id: "fb9eaa9d-5c20-49a9-a7a6-acae28ac5e20"
Add:
mkt ast show <file>
mkt ast stats <file>
Use the current parsed document and token model. Do not require cache presence for single-file use.
Implemented: mkt ast show <file> and mkt ast stats <file> with JSON, YAML, and
tree/text output modes.
P7.3 - Add optional JSONPath query adapter
id: MKTT-WP-0007-T003
status: done
priority: high
state_hub_task_id: "a7b46b32-f322-4fe0-a6fb-60b0b823593c"
Support JSONPath over Document.to_dict() behind an optional dependency and
shared query result envelope.
Implemented: query_document_jsonpath() and extract_document_jsonpath() use
the optional jsonpath-ng dependency and return the same QueryMatch envelope
as the compact selector engine. CLI mkt query and mkt extract accept
--engine jsonpath.
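The optional-dependency pattern described above can be sketched as follows. This is an illustration only: the Match dataclass is a hypothetical stand-in for the QueryMatch envelope, query_jsonpath is not the project's query_document_jsonpath(), and the sample document shape is invented. The jsonpath-ng calls used (parse, find, and the value and full_path attributes of each result) are part of that library's public API.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Match:
    """Hypothetical envelope shared by all query engines."""
    engine: str
    path: str
    value: Any


def query_jsonpath(document: dict, expression: str) -> List[Match]:
    """Evaluate a JSONPath expression over a plain dict, importing jsonpath-ng lazily."""
    try:
        from jsonpath_ng import parse  # optional dependency, needed only for this engine
    except ImportError as exc:
        raise RuntimeError(
            "JSONPath queries require the optional jsonpath-ng package"
        ) from exc
    return [
        Match(engine="jsonpath", path=str(found.full_path), value=found.value)
        for found in parse(expression).find(document)
    ]
```

Called with a dict such as {"blocks": [{"text": "Intro"}]} and the expression $.blocks[*].text, this returns one Match per block text. Because the import happens inside the function, the rest of the tool keeps working when jsonpath-ng is not installed.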
P7.4 - Build SQLite metadata and JSON index
id: MKTT-WP-0007-T004
status: done
priority: medium
state_hub_task_id: "479f11a3-4ab4-451b-991c-7f143f2bffea"
Persist source files, content hashes, frontmatter, headings, sections, blocks, and metrics in SQLite.
Keep schema extension points for reference edges, named regions, chunks, and processor outputs.
Implementation hints:
- Use narrow metadata tables for hot refresh decisions.
- Store document/token JSON separately from searchable section/block rows.
- Add indexes on path, content hash, snapshot id, parser version, and unit ids.
- Preserve source spans and content-unit ids from WP-0010 reference/literate layers.
Implemented: source, heading, section, block, dependency, and metadata tables; document/frontmatter/metrics/provenance JSON payloads; hot-path indexes on path, content hash, snapshot id, parser identity, unit path, and dependency target.
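As a rough illustration of the searchable unit rows and hot-path indexes described above, the sketch below defines a single section table with source spans and two indexes. The table layout, column names, and index names are assumptions chosen for the example; the actual schema is documented in docs/local-index-backend.md.

```python
import sqlite3

# Hypothetical unit table: one row per section, keyed by snapshot and unit path.
UNIT_SCHEMA = """
CREATE TABLE IF NOT EXISTS section (
    snapshot_id TEXT NOT NULL,
    path TEXT NOT NULL,
    unit_path TEXT NOT NULL,
    heading TEXT,
    start_line INTEGER NOT NULL,
    end_line INTEGER NOT NULL,
    PRIMARY KEY (snapshot_id, unit_path)
);
CREATE INDEX IF NOT EXISTS idx_section_path ON section(path);
CREATE INDEX IF NOT EXISTS idx_section_unit_path ON section(unit_path);
"""


def open_index(db_path=":memory:"):
    conn = sqlite3.connect(db_path)
    conn.executescript(UNIT_SCHEMA)
    return conn


def sections_for(conn, path):
    """Return (unit_path, heading, start_line, end_line) rows in document order."""
    return conn.execute(
        "SELECT unit_path, heading, start_line, end_line "
        "FROM section WHERE path = ? ORDER BY start_line",
        (path,),
    ).fetchall()
```

Keeping spans in plain integer columns means a query result can point back into the source file without loading the document JSON.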
P7.5 - Add FTS5 section/block search
id: MKTT-WP-0007-T005
status: done
priority: medium
state_hub_task_id: "0f03e9be-b6f0-4e4b-8220-3bbf638a892b"
Add full-text search over section and block text with source spans and relevance ranking.
Implemented: local SQLite index creates an FTS5 search_units virtual table
for sections and blocks, including path, snapshot id, unit kind/index, heading,
text, source spans, and BM25 rank. CLI mkt search <text> queries it.
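A minimal sketch of such a table, assuming a SQLite build with FTS5 enabled, is shown below. Only the table name search_units comes from the text above; the exact columns, their order, and the helper functions are assumptions. Columns marked UNINDEXED are stored and returned but not tokenized, and the built-in bm25() function yields lower values for better matches, so ascending order puts the most relevant unit first.

```python
import sqlite3


def build_search_index(units):
    """Create an in-memory FTS5 table and load (path, snapshot_id, unit_kind,
    unit_index, start_line, end_line, heading, text) tuples into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE VIRTUAL TABLE search_units USING fts5(
            path UNINDEXED,
            snapshot_id UNINDEXED,
            unit_kind UNINDEXED,
            unit_index UNINDEXED,
            start_line UNINDEXED,
            end_line UNINDEXED,
            heading,
            text
        )
        """
    )
    conn.executemany("INSERT INTO search_units VALUES (?, ?, ?, ?, ?, ?, ?, ?)", units)
    return conn


def search(conn, query, limit=10):
    """Return matching units with their source span, best BM25 match first."""
    return conn.execute(
        """
        SELECT path, unit_kind, unit_index, heading, start_line, end_line,
               bm25(search_units) AS rank
        FROM search_units
        WHERE search_units MATCH ?
        ORDER BY rank
        LIMIT ?
        """,
        (query, limit),
    ).fetchall()
```

The query string is passed to FTS5 as-is, so text containing punctuation or FTS5 operators would need quoting before it reaches MATCH; that handling is omitted here.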
P7.6 - Add incremental refresh
id: MKTT-WP-0007-T006
status: done
priority: medium
state_hub_task_id: "7d9472e6-0716-435b-866c-d2c66ad786cf"
Refresh only changed files based on content hash and parser version.
Include dependency invalidation hooks for future transclusion/reference graphs.
Implementation hints:
- Drive incremental refresh from SnapshotRefreshPlan.
- The first pass should use cheap metadata; only hash metadata-changed files.
- With --verify-hashes, skip parse/index when content is unchanged and only update metadata.
- Use reverse dependency edges for direct and transitive invalidation.
- Report planned vs actual counts for hash, parse, index, metadata update, delete, and invalidation work.
Implemented first pass: LocalSnapshotStore.build() drives refresh from
SnapshotRefreshPlan, hashes metadata-changed files by default, skips
unchanged content, updates metadata-only rows, refreshes changed snapshots, and
deletes removed files.
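The direct and transitive invalidation mentioned in the hints can be sketched as a breadth-first walk over reversed dependency edges. The function name and the (source, target) edge shape are assumptions for illustration; the real DependencyEdge type may carry more fields.

```python
from collections import defaultdict, deque
from typing import Iterable, Set, Tuple


def invalidated_by(changed: Iterable[str], edges: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return every file that depends, directly or transitively, on a changed file.

    Each edge is (source, target), meaning source depends on target.
    Files already in `changed` are not repeated in the result.
    """
    changed_set = set(changed)
    # Reverse the edges so a changed target leads to its dependents.
    dependents = defaultdict(set)
    for source, target in edges:
        dependents[target].add(source)

    invalidated: Set[str] = set()
    queue = deque(changed_set)
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, ()):
            if dependent not in changed_set and dependent not in invalidated:
                invalidated.add(dependent)
                queue.append(dependent)  # follow the chain transitively
    return invalidated
```

For example, with edges a -> b and b -> c, a change to c invalidates both b and a. The visited check also makes the walk terminate on cyclic references.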
P7.7 - Add local index CLI
id: MKTT-WP-0007-T007
status: done
priority: high
state_hub_task_id: "35cc63ff-3723-43d5-aaf6-f9312efa0f4b"
Add:
mkt cache init
mkt cache build <path>
mkt cache query <selector-or-query>
mkt search <text>
Implemented:
mkt cache init
mkt cache index <path>
mkt cache query <selector-or-query>
mkt search <text>
The older lightweight manifest commands remain available as mkt cache build,
mkt cache status, and mkt cache fingerprint.
Exit Criteria
- Legacy AST/JSONPath value is recovered as an optional backend.
- Local repeated queries are faster and explainable.
- Simple selectors still work without cache.