Refresh planning layer for backend fabric

2026-05-04 03:25:26 +02:00
parent 3f08a27a24
commit b1577d90db
10 changed files with 797 additions and 2 deletions


@@ -28,6 +28,25 @@ This backend should later be able to index `MKTT-WP-0010` references, named
regions, chunks, and processor provenance without changing its basic storage
contract.
## Preliminary Refinement - Snapshot Refresh Planning
Implemented before starting the SQLite/index tasks: `SnapshotState`,
`SnapshotPlanEntry`, `SnapshotRefreshPlan`, `plan_snapshot_refresh`,
`load_snapshot_state_file`, and CLI `mkt backend refresh-plan`.
This is the performance contract for WP-0007:
- compare cheap metadata before hashing
- hash only likely-changed files when `--verify-hashes` is requested
- parse only files whose identity/content requires a new snapshot
- index only new, changed, unindexed, or dependency-invalidated entries
- carry direct and transitive dependency invalidation forward from
`DependencyEdge`
- keep refresh planning inspectable through JSON/YAML/text output
The future SQLite store should persist enough state to feed this planner
directly and should report actual refresh work against the same categories.
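The metadata-first ordering in the list above can be sketched as follows. This is a minimal illustration, not the actual `plan_snapshot_refresh` implementation: `FileMeta`, `StoredState`, `classify`, the `hash_file` callback, and the category labels are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class FileMeta:
    """Cheap on-disk metadata (hypothetical shape)."""
    size: int
    mtime_ns: int


@dataclass(frozen=True)
class StoredState:
    """Hypothetical stand-in for a persisted snapshot record."""
    size: int
    mtime_ns: int
    content_hash: str
    indexed: bool


def classify(
    path: str,
    current: Optional[FileMeta],
    stored: Optional[StoredState],
    verify_hashes: bool = False,
    hash_file: Optional[Callable[[str], str]] = None,
) -> str:
    """Return one illustrative planning category for a single path."""
    if stored is None:
        return "new"
    if current is None:
        return "deleted"
    # Step 1: compare cheap metadata before doing any hashing.
    same_meta = (current.size, current.mtime_ns) == (stored.size, stored.mtime_ns)
    if same_meta:
        return "unchanged" if stored.indexed else "unindexed"
    # Step 2: hash only files whose metadata changed, and only on request.
    if verify_hashes and hash_file is not None:
        if hash_file(path) == stored.content_hash:
            # Content is identical, so no new snapshot is needed.
            return "metadata-only" if stored.indexed else "unindexed"
    return "changed"
```

Only entries classified as new, changed, or unindexed would go on to parse or index work in this sketch; dependency invalidation would be layered on top of it.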
## P7.1 - Implement local snapshot store
```task
@@ -40,6 +59,14 @@ state_hub_task_id: "8894a9a4-586c-457b-b4e6-add8276ff5f2"
Persist parsed document snapshots and source metadata in a local cache
directory.
Implementation hints:
- Persist `SnapshotState` fields in the snapshot/source tables.
- Store path, size, mtime, content hash, parser id/version, parse options hash,
contract hash, snapshot id, indexed flag, and dependency edges.
- Keep large document/token JSON lazy-loadable so refresh planning does not
pull whole AST payloads into memory.
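One way to keep payloads lazy-loadable is to split the hot metadata from the large JSON, as in the sketch below. The table names, column names, and helper functions are assumptions made for illustration, and dependency edges are omitted for brevity.

```python
import json
import sqlite3

# Hypothetical schema: narrow state table plus a separate payload table.
SCHEMA = """
CREATE TABLE snapshot_state (
    path TEXT PRIMARY KEY,
    size INTEGER NOT NULL,
    mtime_ns INTEGER NOT NULL,
    content_hash TEXT NOT NULL,
    parser_id TEXT NOT NULL,
    parser_version TEXT NOT NULL,
    parse_options_hash TEXT NOT NULL,
    contract_hash TEXT NOT NULL,
    snapshot_id TEXT NOT NULL,
    indexed INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE snapshot_payload (
    snapshot_id TEXT PRIMARY KEY,
    document_json TEXT NOT NULL
);
"""


def open_store(db_path=":memory:"):
    """Open a store and create the illustrative tables."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def load_planning_rows(conn):
    """Read only the narrow columns that refresh planning needs."""
    return conn.execute(
        "SELECT path, size, mtime_ns, content_hash, indexed "
        "FROM snapshot_state ORDER BY path"
    ).fetchall()


def load_payload(conn, snapshot_id):
    """Fetch and decode one document payload on demand."""
    row = conn.execute(
        "SELECT document_json FROM snapshot_payload WHERE snapshot_id = ?",
        (snapshot_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Because `load_planning_rows` never touches `snapshot_payload`, a planning pass over many files reads small rows only, and the AST JSON is decoded just for the snapshots that are actually inspected.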
## P7.2 - Add AST introspection commands
```task
@@ -86,6 +113,14 @@ and metrics in SQLite.
Keep schema extension points for reference edges, named regions, chunks, and
processor outputs.
Implementation hints:
- Use narrow metadata tables for hot refresh decisions.
- Store document/token JSON separately from searchable section/block rows.
- Add indexes on path, content hash, snapshot id, parser version, and unit ids.
- Preserve source spans and content-unit ids from WP-0010 reference/literate
layers.
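As a rough illustration of the index hints above, searchable block rows might carry their unit id and source span and be indexed on the lookup columns. The `block_row` table, its columns, and the index names are hypothetical; indexes on content hash and parser version would sit on whichever metadata table holds those columns.

```python
import sqlite3

# Hypothetical block table: unit id and source span are kept on every row.
INDEX_DDL = """
CREATE TABLE IF NOT EXISTS block_row (
    unit_id TEXT PRIMARY KEY,
    snapshot_id TEXT NOT NULL,
    path TEXT NOT NULL,
    span_start INTEGER NOT NULL,
    span_end INTEGER NOT NULL,
    text TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_block_snapshot ON block_row (snapshot_id);
CREATE INDEX IF NOT EXISTS idx_block_path ON block_row (path);
"""


def create_block_index(conn):
    """Create the table and indexes, then list the explicit index names."""
    conn.executescript(INDEX_DDL)
    rows = conn.execute("PRAGMA index_list('block_row')").fetchall()
    # Each row is (seq, name, unique, origin, partial); skip the implicit
    # primary-key index that SQLite creates automatically.
    return sorted(row[1] for row in rows if row[1].startswith("idx_"))
```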
## P7.5 - Add FTS5 section/block search
```task
@@ -111,6 +146,16 @@ Refresh only changed files based on content hash and parser version.
Include dependency invalidation hooks for future transclusion/reference graphs.
Implementation hints:
- Drive incremental refresh from `SnapshotRefreshPlan`.
- The first pass should compare cheap metadata only; hash just the files whose
metadata changed.
- With `--verify-hashes`, when the content hash is unchanged, skip parse/index
and update only the stored metadata.
- Use reverse dependency edges for direct and transitive invalidation.
- Report planned vs actual counts for hash, parse, index, metadata update,
delete, and invalidation work.
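Direct and transitive invalidation over reverse dependency edges can be sketched as a breadth-first walk. The `(source, target)` tuple shape, meaning "source depends on target", is an assumption for the example and not the actual `DependencyEdge` definition; `invalidated_by` is a hypothetical helper.

```python
from collections import defaultdict, deque


def invalidated_by(changed, edges):
    """Return (direct, transitive) dependents of the changed paths.

    `edges` is an iterable of (source, target) pairs, where `source`
    depends on `target` (assumed shape).
    """
    # Build the reverse map: target -> everything that depends on it.
    reverse = defaultdict(set)
    for source, target in edges:
        reverse[target].add(source)

    direct, transitive = set(), set()
    seen = set(changed)
    queue = deque((path, 0) for path in changed)
    while queue:
        path, depth = queue.popleft()
        for dependent in reverse.get(path, ()):
            if dependent in seen:
                continue  # already handled; this also guards against cycles
            seen.add(dependent)
            (direct if depth == 0 else transitive).add(dependent)
            queue.append((dependent, depth + 1))
    return direct, transitive
```

The sizes of the two returned sets could feed the planned invalidation count, to be compared against the invalidation work actually performed during refresh.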
## P7.7 - Add local index CLI
```task