generated from coulomb/repo-seed
SQLite-backed local snapshot store
87
docs/local-index-backend.md
Normal file
@@ -0,0 +1,87 @@
# Local Index Backend

`markitect-tool` now includes a local SQLite snapshot/index backend as the
first practical implementation of the optional backend fabric.

## Purpose

The local index is optimized for repeatable Markdown infrastructure work:

- persist parsed document snapshots
- keep cheap source metadata for incremental refresh planning
- store document JSON for later AST/JSONPath use
- index frontmatter, headings, sections, blocks, and metrics
- preserve extension points for dependency edges, references, named regions,
  chunks, processor outputs, FTS, and policy-aware access

The backend is optional. Single-file commands such as `mkt parse`, `mkt query`,
and `mkt ast` do not require it.

## Commands

Initialize the SQLite store:

```text
mkt cache init --root .
```

Build or refresh the local index:

```text
mkt cache index docs workplans --root .
```

Inspect a parsed AST without using the cache:

```text
mkt ast show docs/backend-fabric.md --format tree
mkt ast stats docs/backend-fabric.md
```

By default, the index is written to:

```text
.markitect/cache/index.sqlite3
```

Use `--index-path` to override it.
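
The default-versus-override rule can be sketched in a few lines of Python. This is an illustration only: `resolve_index_path` is a hypothetical helper, not a function from `markitect-tool`, and it assumes the default path is resolved relative to `--root` while an explicit `--index-path` is used as given.

```python
from pathlib import Path
from typing import Optional

# Default location relative to the project root, as documented above.
DEFAULT_INDEX_RELPATH = Path(".markitect") / "cache" / "index.sqlite3"


def resolve_index_path(root: str, index_path: Optional[str] = None) -> Path:
    """Hypothetical helper: choose the SQLite file for --root / --index-path."""
    if index_path is not None:
        # Assumption: an explicit override is taken as-is, not joined to root.
        return Path(index_path)
    return Path(root) / DEFAULT_INDEX_RELPATH
```

Calling `resolve_index_path("proj")` yields `proj/.markitect/cache/index.sqlite3`, while passing a second argument returns that path unchanged.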

## Refresh Behavior

`mkt cache index` uses the same cheap-first refresh planning model as
`mkt backend refresh-plan`:

1. Compare path, size, mtime, parser identity, parse options, and contract hash.
2. Hash only files whose metadata changed.
3. Skip parse/index when metadata changed but content hash stayed the same.
4. Parse and index new or changed files.
5. Delete rows for removed source files.

The command reports planned work and actual work separately in JSON/YAML output.
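
The five steps above can be sketched roughly as follows. This is a minimal illustration, not the actual planner: `SourceRecord`, `RefreshPlan`, and `plan_refresh` are hypothetical names, `parser_key` stands in for parser identity, parse options, and contract hash combined, and the sketch assumes that a changed `parser_key` forces a reparse even when the content hash is unchanged (the steps above do not spell that case out).

```python
import hashlib
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass(frozen=True)
class SourceRecord:
    """Metadata remembered from the previous run (field names are assumptions)."""
    size: int
    mtime_ns: int
    content_hash: str
    parser_key: str  # stand-in for parser identity + parse options + contract hash


@dataclass
class RefreshPlan:
    unchanged: List[str] = field(default_factory=list)      # metadata identical
    metadata_only: List[str] = field(default_factory=list)  # touched, same content
    to_index: List[str] = field(default_factory=list)       # new or changed
    to_delete: List[str] = field(default_factory=list)      # no longer on disk


def sha256_file(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_refresh(files: List[Path], stored: Dict[str, SourceRecord],
                 parser_key: str) -> RefreshPlan:
    plan = RefreshPlan()
    seen = set()
    for path in files:
        key = str(path)
        seen.add(key)
        stat = path.stat()
        old = stored.get(key)
        if old is None or old.parser_key != parser_key:
            # New file, or parser/options/contract changed (assumed to force a reparse).
            plan.to_index.append(key)
        elif (old.size, old.mtime_ns) == (stat.st_size, stat.st_mtime_ns):
            # Step 1: cheap metadata matches, so the file is never hashed.
            plan.unchanged.append(key)
        elif sha256_file(path) == old.content_hash:
            # Steps 2-3: metadata changed but content did not; skip parse/index.
            plan.metadata_only.append(key)
        else:
            # Step 4: content really changed.
            plan.to_index.append(key)
    # Step 5: anything stored but not seen on disk gets its rows deleted.
    plan.to_delete = sorted(set(stored) - seen)
    return plan
```

The point of the ordering is that hashing, the first step that reads file contents, only happens for files whose cheap metadata already disagrees with the stored record.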

## Stored Data

The first schema stores:

- `sources`: path, absolute path, size, mtime, content hash, snapshot id,
  parser identity, parse option hash, contract hash, document JSON,
  frontmatter JSON, metrics JSON, provenance JSON, and indexed flag
- `headings`: heading level, text, and source line
- `sections`: heading metadata, section text, and source span
- `blocks`: block type, text, source span, and heading level
- `dependencies`: reserved dependency edge table for references,
  transclusion, literate chunks, and future invalidation graphs

This is enough to recover the useful markitect-main idea of keeping parsed
structure available for faster and richer query backends, while keeping the
normal CLI usable without a cache.
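
To make the table layout more concrete, here is a reduced sketch of how two of these tables might look in SQLite, using Python's standard `sqlite3` module. The column names, types, and the helpers `open_index`, `store_source`, and `remove_source` are assumptions for illustration, not the actual schema; most `sources` columns are omitted for brevity, and `sections`, `blocks`, and `dependencies` would follow the same pattern.

```python
import json
import sqlite3

# Hypothetical, trimmed-down schema: the real table definitions may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sources (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE,
    content_hash TEXT NOT NULL,
    document_json TEXT NOT NULL,
    indexed INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS headings (
    source_id INTEGER NOT NULL REFERENCES sources(id) ON DELETE CASCADE,
    level INTEGER NOT NULL,
    text TEXT NOT NULL,
    line INTEGER NOT NULL
);
"""


def open_index(db_path=":memory:"):
    conn = sqlite3.connect(db_path)
    # SQLite enforces foreign keys (and cascades) only when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def store_source(conn, path, content_hash, document, headings):
    """Insert one parsed document plus its (level, text, line) heading rows."""
    cur = conn.execute(
        "INSERT INTO sources (path, content_hash, document_json, indexed) "
        "VALUES (?, ?, ?, 1)",
        (path, content_hash, json.dumps(document)),
    )
    source_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO headings (source_id, level, text, line) VALUES (?, ?, ?, ?)",
        [(source_id, level, text, line) for level, text, line in headings],
    )
    conn.commit()
    return source_id


def remove_source(conn, path):
    """Delete a source; its heading rows go with it via ON DELETE CASCADE."""
    conn.execute("DELETE FROM sources WHERE path = ?", (path,))
    conn.commit()
```

With a layout like this, a query such as `SELECT text FROM headings WHERE level = 2` runs against stored rows without reparsing any Markdown, and deleting a removed source file cleans up its dependent rows in one statement.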

## Future Work

`MKTT-WP-0007` still needs:

- JSONPath query adapter over stored or live document JSON
- FTS5 search over section/block rows
- cache-backed query commands
- richer dependency extraction from references, transclusion, and literate
  chunks
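
As a rough idea of what the FTS5 item could involve, the sketch below indexes section text in an SQLite FTS5 virtual table and runs a ranked match query. It is not part of the current backend: the `section_fts` table, its columns, and both helper functions are hypothetical, and it assumes the SQLite library bundled with Python was built with FTS5 enabled.

```python
import sqlite3


def build_section_fts(conn, sections):
    """Index (path, heading, body) tuples in a hypothetical FTS5 table."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS section_fts "
        "USING fts5(path, heading, body)"
    )
    conn.executemany(
        "INSERT INTO section_fts (path, heading, body) VALUES (?, ?, ?)",
        sections,
    )
    conn.commit()


def search_sections(conn, query):
    """Return (path, heading) pairs for matching sections, best match first."""
    return conn.execute(
        "SELECT path, heading FROM section_fts "
        "WHERE section_fts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
```

Given section rows taken from this document, `search_sections(conn, "metadata")` would return the sections whose indexed text contains that word, without scanning the Markdown files themselves.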
@@ -33,7 +33,7 @@ and descriptions mirror the operational view.
 | `MKTT-WP-0003` | complete | done | `MKTT-WP-0001`, `MKTT-WP-0002`, `MKTT-WP-0004` | Core toolkit implementation is complete. |
 | `MKTT-WP-0006` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T005` | Optional backend fabric is complete: manifests, capabilities, snapshot identity, interfaces, registry, provenance, and read-only CLI scaffolding. |
 | `MKTT-WP-0010` | complete | done | `MKTT-WP-0004`; task-level trigger: `MKTT-WP-0003-T006` | Content references, processors, explode/implode, weave/tangle, content classes, and migration examples are complete as the first WP-0010 extension layer. |
-| `MKTT-WP-0007` | P2 | todo | `MKTT-WP-0006` | First practical cache backend use case: AST/JSONPath/SQLite/FTS. Preliminary refresh planning is in place as the performance contract. |
+| `MKTT-WP-0007` | P2 | todo | `MKTT-WP-0006` | First practical cache backend use case: AST/JSONPath/SQLite/FTS. SQLite snapshots, AST inspection, metadata indexing, and incremental refresh are in place; JSONPath, FTS, and cache-backed query remain. |
 | `MKTT-WP-0005` | P2 | todo | `MKTT-WP-0003`, `MKTT-WP-0004` | Pick up when generation/form/context or semantic assessment pressure appears. |
 | `MKTT-WP-0011` | P2 | todo | `MKTT-WP-0003`; task-level triggers: `MKTT-WP-0010-T001`, `MKTT-WP-0010-T005` | Declarative Markdown dataflow workflows: source extraction, deterministic/assisted processing, and multi-output generation. |
 | `MKTT-WP-0009` | P2 | todo | `MKTT-WP-0006` | Establish access-control gateway before security-sensitive cache/context use. |