SQLite-backed local snapshot store

2026-05-04 08:56:41 +02:00
parent 0d1ad21a9f
commit 36ff4cedab
7 changed files with 926 additions and 5 deletions


@@ -0,0 +1,87 @@
# Local Index Backend
`markitect-tool` now includes a local SQLite snapshot/index backend as the
first practical implementation of the optional backend fabric.
## Purpose
The local index is optimized for repeatable Markdown infrastructure work:
- persist parsed document snapshots
- keep cheap source metadata for incremental refresh planning
- store document JSON for later AST/JSONPath use
- index frontmatter, headings, sections, blocks, and metrics
- preserve extension points for dependency edges, references, named regions,
chunks, processor outputs, FTS, and policy-aware access
The backend is optional. Single-file commands such as `mkt parse`, `mkt query`,
and `mkt ast` do not require it.
## Commands
Initialize the SQLite store:
```text
mkt cache init --root .
```
Build or refresh the local index:
```text
mkt cache index docs workplans --root .
```
Inspect a parsed AST without using the cache:
```text
mkt ast show docs/backend-fabric.md --format tree
mkt ast stats docs/backend-fabric.md
```
By default, the index is written to:
```text
.markitect/cache/index.sqlite3
```
Use `--index-path` to override it.
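The path resolution can be sketched roughly as follows. This is an illustration only: `resolve_index_path` is a hypothetical helper, not a function from `markitect-tool`, and it assumes the default path is taken relative to `--root`, which the text above does not state explicitly.

```python
from pathlib import Path
from typing import Optional

# Default location from the section above; treating it as relative to the
# root directory is an assumption made for this sketch.
DEFAULT_INDEX_RELPATH = Path(".markitect") / "cache" / "index.sqlite3"


def resolve_index_path(root: str, index_path: Optional[str] = None) -> Path:
    """Hypothetical helper: use the override if given, else the default under root."""
    if index_path is not None:
        return Path(index_path)
    return Path(root) / DEFAULT_INDEX_RELPATH
```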
## Refresh Behavior
`mkt cache index` uses the same cheap-first refresh planning model as
`mkt backend refresh-plan`:
1. Compare path, size, mtime, parser identity, parse options, and contract hash.
2. Hash only files whose metadata changed.
3. Skip parse/index when metadata changed but content hash stayed the same.
4. Parse and index new or changed files.
5. Delete rows for removed source files.
The command reports planned work and actual work separately in JSON/YAML output.
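The steps above can be sketched as a small planning function. This is a minimal illustration, not the actual `markitect-tool` code: `SourceMeta`, `plan_refresh`, the bucket names, and the choice of SHA-256 are all assumptions. It also assumes that a change in parser identity, parse options, or contract hash forces a re-parse even when the content hash is unchanged, which the list above does not spell out.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class SourceMeta:
    """Cheap per-file metadata; field names are illustrative."""
    size: int
    mtime_ns: int
    parser_id: str
    options_hash: str
    contract_hash: str
    content_hash: Optional[str] = None  # only known for already-indexed rows

    def cheap_key(self):
        # Everything that can be compared without reading the file.
        return (self.size, self.mtime_ns) + self.parse_key()

    def parse_key(self):
        # Anything here changing means old parse results are stale (assumption).
        return (self.parser_id, self.options_hash, self.contract_hash)


def plan_refresh(
    stored: Dict[str, SourceMeta],
    current: Dict[str, SourceMeta],
    read_bytes: Callable[[str], bytes],
) -> Dict[str, List[str]]:
    """Classify paths into unchanged / touch / parse / delete buckets."""
    plan: Dict[str, List[str]] = {"unchanged": [], "touch": [], "parse": [], "delete": []}
    for path, meta in current.items():
        old = stored.get(path)
        if old is None:
            plan["parse"].append(path)  # step 4: new file
            continue
        if old.cheap_key() == meta.cheap_key():
            plan["unchanged"].append(path)  # step 1: metadata matches, no hashing
            continue
        # Step 2: hash only files whose metadata changed.
        digest = hashlib.sha256(read_bytes(path)).hexdigest()
        if digest == old.content_hash and old.parse_key() == meta.parse_key():
            plan["touch"].append(path)  # step 3: same content, refresh metadata only
        else:
            plan["parse"].append(path)  # step 4: changed file
    # Step 5: rows whose source file no longer exists.
    plan["delete"] = sorted(set(stored) - set(current))
    return plan
```

Keeping the plan as plain lists of paths makes it straightforward to report planned work separately from the work actually performed.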
## Stored Data
The first schema stores:
- `sources`: path, absolute path, size, mtime, content hash, snapshot id,
parser identity, parse option hash, contract hash, document JSON,
frontmatter JSON, metrics JSON, provenance JSON, and indexed flag
- `headings`: heading level, text, and source line
- `sections`: heading metadata, section text, and source span
- `blocks`: block type, text, source span, and heading level
- `dependencies`: reserved dependency edge table for references,
transclusion, literate chunks, and future invalidation graphs
This is enough to carry over the useful idea from markitect-main of keeping
parsed structure available for faster and richer query backends, while the
normal CLI stays usable without a cache.
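To make the table layout concrete, here is a hedged sketch of what such a schema could look like using Python's standard `sqlite3` module. The table names follow the list above, but every column name, type, and constraint is an assumption for illustration; the real schema in this commit may differ.

```python
import sqlite3

# Illustrative schema only: column names and types are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sources (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE,
    absolute_path TEXT NOT NULL,
    size INTEGER NOT NULL,
    mtime_ns INTEGER NOT NULL,
    content_hash TEXT,
    snapshot_id TEXT,
    parser_identity TEXT,
    parse_option_hash TEXT,
    contract_hash TEXT,
    document_json TEXT,
    frontmatter_json TEXT,
    metrics_json TEXT,
    provenance_json TEXT,
    indexed INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS headings (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(id) ON DELETE CASCADE,
    level INTEGER NOT NULL,
    text TEXT NOT NULL,
    line INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS sections (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(id) ON DELETE CASCADE,
    heading_level INTEGER,
    heading_text TEXT,
    text TEXT NOT NULL,
    start_line INTEGER NOT NULL,
    end_line INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS blocks (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(id) ON DELETE CASCADE,
    block_type TEXT NOT NULL,
    text TEXT NOT NULL,
    start_line INTEGER NOT NULL,
    end_line INTEGER NOT NULL,
    heading_level INTEGER
);
CREATE TABLE IF NOT EXISTS dependencies (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(id) ON DELETE CASCADE,
    kind TEXT NOT NULL,   -- e.g. reference, transclusion, chunk
    target TEXT NOT NULL
);
"""


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store and make sure the sketch schema exists."""
    conn = sqlite3.connect(db_path)
    # Foreign keys are off by default in SQLite; enable them per connection
    # so deleting a source row also removes its dependent rows.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With `ON DELETE CASCADE` on each child table, removing a row from `sources` is enough to drop its headings, sections, blocks, and dependency edges, which matches the "delete rows for removed source files" refresh step.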
## Future Work
`MKTT-WP-0007` still needs:
- JSONPath query adapter over stored or live document JSON
- FTS5 search over section/block rows
- cache-backed query commands
- richer dependency extraction from references, transclusion, and literate
chunks


@@ -33,7 +33,7 @@ and descriptions mirror the operational view.
| `MKTT-WP-0003` | complete | done | `MKTT-WP-0001`, `MKTT-WP-0002`, `MKTT-WP-0004` | Core toolkit implementation is complete. |
| `MKTT-WP-0006` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T005` | Optional backend fabric is complete: manifests, capabilities, snapshot identity, interfaces, registry, provenance, and read-only CLI scaffolding. |
| `MKTT-WP-0010` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T006` | Content references, processors, explode/implode, weave/tangle, content classes, and migration examples are complete as the first WP-0010 extension layer. |
| `MKTT-WP-0007` | P2 | todo | `MKTT-WP-0006` | First practical cache backend use case: AST/JSONPath/SQLite/FTS. Preliminary refresh planning is in place as the performance contract. |
| `MKTT-WP-0007` | P2 | todo | `MKTT-WP-0006` | First practical cache backend use case: AST/JSONPath/SQLite/FTS. SQLite snapshots, AST inspection, metadata indexing, and incremental refresh are in place; JSONPath, FTS, and cache-backed query remain. |
| `MKTT-WP-0005` | P2 | todo | `MKTT-WP-0003`, `MKTT-WP-0004` | Pick up when generation/form/context or semantic assessment pressure appears. |
| `MKTT-WP-0011` | P2 | todo | `MKTT-WP-0003`; task-level triggers: `MKTT-WP-0010-T001`, `MKTT-WP-0010-T005` | Declarative Markdown dataflow workflows: source extraction, deterministic/assisted processing, and multi-output generation. |
| `MKTT-WP-0009` | P2 | todo | `MKTT-WP-0006` | Establish access-control gateway before security-sensitive cache/context use. |