# Optional Backend Fabric
Date: 2026-05-04
## Purpose
The backend fabric is the WP-0006 architecture layer for persistent snapshots,
indexes, query adapters, context packages, policy gateways, and provenance.
It is optional. The core parser, contracts, query engine, transforms, includes,
processors, templates, and generation commands keep working without any backend
manifest or persistent service.
## Capability Model
Backend manifests declare capabilities by name. The initial common vocabulary
is:
- `snapshots`
- `ast`
- `json`
- `jsonpath`
- `fts`
- `sql`
- `vector`
- `hybrid`
- `context_packages`
- `policy`
- `policy_pushdown`
- `provenance`
- `reference_graph`
- `processor_results`
- `source_maps`

Unknown capabilities are preserved in manifest metadata as extension hints, but
compatibility checks only reason over declared names.
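
A minimal sketch of a name-based compatibility check, assuming the check is plain set membership over the names a manifest declares. `CapabilityReport` and `check_capabilities` are hypothetical names for illustration, not the tool's actual API:

```python
from dataclasses import dataclass

# Common vocabulary copied from the list above.
KNOWN_CAPABILITIES = frozenset({
    "snapshots", "ast", "json", "jsonpath", "fts", "sql", "vector", "hybrid",
    "context_packages", "policy", "policy_pushdown", "provenance",
    "reference_graph", "processor_results", "source_maps",
})


@dataclass(frozen=True)
class CapabilityReport:
    """Hypothetical result type, used only in this sketch."""
    satisfied: bool
    missing: tuple
    extension_hints: tuple


def check_capabilities(declared, required):
    """Compare required names against declared names; no semantics beyond names."""
    declared_names = set(declared)
    missing = tuple(sorted(set(required) - declared_names))
    # Names outside the common vocabulary are kept as hints, not rejected.
    hints = tuple(sorted(declared_names - KNOWN_CAPABILITIES))
    return CapabilityReport(not missing, missing, hints)
```

An unknown name such as `x-graphql` would end up in `extension_hints` without affecting whether the required names are satisfied.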
## Manifests
Backends can be declared as YAML files or Markdown files with a
`markitect-backend` fenced YAML block:
````markdown
```yaml markitect-backend
id: local-sqlite-cache
kind: cache-backend
capabilities:
- snapshots
- json
- fts
- provenance
storage:
  engine: sqlite
  path: .markitect/cache/index.sqlite
policy:
  mode: labels
```
````
The loader reads manifests only. It does not import optional dependencies or
open a database.
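
As an illustration of a loader that touches nothing but text, the sketch below pulls the raw YAML out of `markitect-backend` fenced blocks using only the standard library. The function name and the regular expression are assumptions for this example; parsing the extracted YAML is left to whatever YAML library the project already uses:

```python
import re

FENCE = "`" * 3  # built indirectly so this example nests cleanly inside Markdown

# Matches a fence whose info string is "yaml markitect-backend" and captures its body.
BLOCK_RE = re.compile(
    rf"^{FENCE}yaml[ \t]+markitect-backend[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_manifest_blocks(markdown_text):
    """Return the raw YAML text of every markitect-backend block, in order."""
    return [match.group(1) for match in BLOCK_RE.finditer(markdown_text)]
```

Plain `yaml` fences without the `markitect-backend` marker are ignored, so ordinary YAML examples in the same file are not mistaken for manifests.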
## Snapshot Identity
Snapshot identity is content-addressed and includes:
- source path
- source content hash
- parser id
- parser version
- parse options hash
- optional contract hash

The resulting `snapshot_id` is a stable hash over those identity fields. This
lets future AST, JSONPath, FTS, SQL, vector, policy, and context-package
backends invalidate derived data without guessing what changed.
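
One way to derive such an id is to hash a canonical serialization of the identity fields. The sketch below assumes SHA-256 over sorted-key JSON; the actual hash function and encoding used by `mkt backend snapshot-id` are not specified here, and `SnapshotIdentity` is a hypothetical name:

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SnapshotIdentity:
    """The six identity fields listed above (field names are illustrative)."""
    source_path: str
    content_hash: str
    parser_id: str
    parser_version: str
    parse_options_hash: str
    contract_hash: Optional[str] = None


def snapshot_id(identity):
    # Sorted keys and fixed separators keep the digest independent of field order.
    payload = json.dumps(asdict(identity), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because every field feeds the digest, bumping the parser version or changing parse options yields a new id even when the file content is identical.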
## Refresh Planning
Before WP-0007 writes a local SQLite index, the backend fabric provides a
read-only refresh planner. The planner compares current Markdown files with a
portable snapshot-state inventory and reports:
- unchanged files
- files that need hashing
- files that need parsing
- files that need indexing
- files that only need metadata updates
- deleted sources
- dependency-invalidated dependents

The planner uses a cheap-first strategy:
1. Compare path, size, mtime, parser version, parse options hash, and contract
hash.
2. If cheap metadata is unchanged, skip hashing, parsing, and indexing.
3. If metadata changed, either mark the file for hash/parse/index or, with
`--verify-hashes`, hash only those changed candidates to avoid parsing when
content is unchanged.
4. Use dependency edges to invalidate direct and transitive dependents.

This gives WP-0007 a performance contract before the storage engine exists.
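
Steps 1 to 3 can be sketched as a per-file decision function. Everything named here (`FileState`, `plan_file`, the returned labels) is hypothetical, and the sketch assumes that hash verification can only skip parsing when the parser version, parse options hash, and contract hash are themselves unchanged:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileState:
    """Cheap metadata plus the last recorded content hash (illustrative shape)."""
    path: str
    size: int
    mtime_ns: int
    parser_version: str
    parse_options_hash: str
    contract_hash: Optional[str]
    content_hash: Optional[str] = None


def parse_inputs(state):
    return (state.parser_version, state.parse_options_hash, state.contract_hash)


def cheap_key(state):
    return (state.path, state.size, state.mtime_ns) + parse_inputs(state)


def plan_file(recorded, current, verify_hashes=False, hash_file=None):
    """Return a simplified action label for one file."""
    if recorded is None:
        return "hash+parse+index"  # new file, nothing to compare against
    if cheap_key(recorded) == cheap_key(current):
        return "unchanged"  # step 2: skip hashing, parsing, and indexing
    same_inputs = parse_inputs(recorded) == parse_inputs(current)
    if verify_hashes and same_inputs and hash_file is not None:
        # Step 3 with hash verification: hash only this changed candidate.
        if hash_file(current.path) == recorded.content_hash:
            return "metadata-only"
    return "hash+parse+index"
```

A file that was merely touched (new mtime, same bytes) comes back as `metadata-only` when hash verification is enabled, which is the case the `--verify-hashes` flag is meant to catch.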
```bash
mkt backend refresh-plan docs --state examples/backend-state/snapshot-state.yaml
mkt backend refresh-plan docs --state .markitect/cache/snapshots.yaml --verify-hashes
```
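
Dependency invalidation (step 4 of the strategy above) amounts to a traversal of the reversed dependency graph. A minimal sketch, assuming edges are stored as `(dependent, dependency)` pairs; the function name and edge representation are illustrative only:

```python
from collections import defaultdict, deque


def invalidated_dependents(edges, changed):
    """Return every file that directly or transitively depends on a changed file."""
    changed = set(changed)
    # Reverse index: dependency -> files that depend on it.
    dependents_of = defaultdict(set)
    for dependent, dependency in edges:
        dependents_of[dependency].add(dependent)

    invalidated = set()
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for dependent in dependents_of.get(node, ()):
            # The membership check also makes the walk safe on cyclic includes.
            if dependent not in invalidated and dependent not in changed:
                invalidated.add(dependent)
                queue.append(dependent)
    return invalidated
```

With edges `a.md -> b.md -> c.md`, a change to `c.md` invalidates both `b.md` and `a.md`, while unrelated files are left alone.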
## Provenance Envelope
The shared backend provenance envelope records:
- operation
- snapshot id
- source path
- content hash
- dependency edges
- backend id
- policy decision id
- extension metadata

This complements the operation-level provenance added in WP-0010 and gives
future snapshot/query/context/policy results a common metadata shape.
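
The envelope maps naturally onto a small immutable record. The sketch below is one possible shape, with field names and types chosen for illustration; the real envelope may name or type its fields differently:

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class ProvenanceEnvelope:
    """Illustrative record mirroring the eight items listed above."""
    operation: str
    snapshot_id: str
    source_path: str
    content_hash: str
    # Each edge is a (dependent, dependency) pair in this sketch.
    dependency_edges: Tuple[Tuple[str, str], ...] = ()
    backend_id: Optional[str] = None
    policy_decision_id: Optional[str] = None
    extensions: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self):
        """Plain-dict form, suitable for attaching to a result as metadata."""
        return asdict(self)
```

Keeping backend and policy fields optional lets results produced without any backend carry the same envelope shape.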
## Interfaces
Protocol interfaces are provided for:
- `SnapshotBackend`
- `IndexBackend`
- `QueryAdapter`
- `ContextPackageRegistry`
- `AccessPolicyGateway`
- `ProcessorResultStore`

These are contracts for future implementations. They are intentionally light
and do not force the current CLI through a persistent backend.
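
To illustrate what a light protocol contract can look like, here is a sketch using `typing.Protocol`. Only the name `SnapshotBackend` comes from the list above; the `put` and `get` methods and the in-memory implementation are assumptions made for this example:

```python
from typing import Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class SnapshotBackend(Protocol):
    """Hypothetical method set; the real protocol may differ."""

    def put(self, snapshot_id: str, payload: bytes) -> None: ...

    def get(self, snapshot_id: str) -> Optional[bytes]: ...


class InMemorySnapshots:
    """Satisfies the protocol structurally, without inheriting from it."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, snapshot_id: str, payload: bytes) -> None:
        self._items[snapshot_id] = payload

    def get(self, snapshot_id: str) -> Optional[bytes]:
        return self._items.get(snapshot_id)
```

Structural typing means an implementation only needs matching methods, so a future SQLite-backed store could be added without the core importing it.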
## CLI
Read-only inspection commands:
```bash
mkt backend list --path examples/backends
mkt backend inspect local-sqlite-cache --path examples/backends --require snapshots --require provenance
mkt backend snapshot-id docs/content-references.md
mkt backend refresh-plan docs --state examples/backend-state/snapshot-state.yaml
```
The existing `mkt cache status` remains the lightweight file-manifest change
detector. Backend manifests are a separate optional fabric.