spec(SHARD-WP-0006 T4): incremental-equivalence correctness + I-2 verification (§8.7)

Fixes B-4. Incremental delta is not additive: a change processes bucket
exits (retract unsupported edges) + entries (add) + propagation across
equivalence neighbours, not just new candidates. Adds an I-2 verification
mechanism: per-partition Merkle-style digest + background consistency-checker
vs sampled fold → scoped self-healing recompute on drift. I-2 now
eventually-verified, not asserted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 02:02:33 +02:00
parent 3a753a6f3b
commit 1ad70a9c8a
2 changed files with 26 additions and 3 deletions


@@ -645,14 +645,37 @@ comparison across all pages of all shards is O(N²) and is forbidden. Instead:
≈O(N) candidates.
2. **Verification** — candidate pairs are confirmed by full fingerprint / span-set overlap and
any curator binding. Confirmed equivalences become union edges.
3. **Incremental maintenance** — a changed page is re-bucketed and only its *new* candidate set
is re-verified; equivalence is maintained per-change, never recomputed globally.
3. **Incremental maintenance — the delta is *not* additive (review B-4).** A changed page may
*leave* buckets as well as *enter* them, and leaving a bucket can **break an existing
equivalence edge** another page relied on. So a change is processed as: (i) recompute the
page's bucket membership; (ii) for buckets it **left**, re-verify the pairs that depended on
the shared bucket and **retract** edges no longer supported; (iii) for buckets it **entered**,
verify the new candidate pairs and **add** edges; (iv) **propagate** to the equivalence
neighbours of any retracted/added edge (equivalence is transitive-ish via chorus sets, so a
retraction can split a set). Maintenance is per-change and bounded by the page's
neighbourhood, but it covers retraction and propagation — not just additions.
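Steps (i) to (iv) can be sketched roughly as follows. This is a minimal illustration, not the spec's implementation: `EquivalenceIndex`, `apply_change`, `equivalence_set`, and the caller-supplied `verify(a, b)` predicate are all hypothetical names, and bucket keys are assumed to be computed elsewhere (for example by the blocking stage).

```python
from collections import defaultdict, deque


class EquivalenceIndex:
    """Hypothetical in-memory index: buckets, per-page membership, union edges."""

    def __init__(self, verify):
        self.verify = verify                 # verify(a, b) -> bool, supplied by caller
        self.buckets = defaultdict(set)      # bucket key -> page ids
        self.membership = defaultdict(set)   # page id -> bucket keys
        self.edges = defaultdict(set)        # page id -> equivalent page ids

    def apply_change(self, page, new_keys):
        """Process one changed page; return the pages whose sets need re-deriving."""
        new_keys = set(new_keys)
        old_keys = self.membership[page]
        left, entered = old_keys - new_keys, new_keys - old_keys

        # (i) recompute bucket membership, remembering who shared each bucket
        former, fresh = set(), set()
        for key in left:
            former |= self.buckets[key]
            self.buckets[key].discard(page)
        for key in entered:
            fresh |= self.buckets[key]
            self.buckets[key].add(page)
        former.discard(page)
        fresh.discard(page)
        self.membership[page] = new_keys

        touched = set()
        # (ii) buckets left: re-verify dependent pairs, retract unsupported edges
        for other in former:
            if other in self.edges[page] and not self.verify(page, other):
                self.edges[page].discard(other)
                self.edges[other].discard(page)
                touched |= {page, other}
        # (iii) buckets entered: verify new candidate pairs, add confirmed edges
        for other in fresh:
            if other not in self.edges[page] and self.verify(page, other):
                self.edges[page].add(other)
                self.edges[other].add(page)
                touched |= {page, other}
        # (iv) propagate to the equivalence neighbours of any changed edge
        affected = set(touched)
        for p in touched:
            affected |= self.edges[p]
        return affected

    def equivalence_set(self, page):
        """Connected component of `page` over the current edges (breadth-first)."""
        seen, queue = {page}, deque([page])
        while queue:
            for nxt in self.edges[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen
```

In this sketch a retraction can split a set: if `b` linked `a` and `c` and then leaves the bucket it shared with `c`, the `b`-`c` edge is re-verified and, if it fails, removed, so `equivalence_set("a")` no longer contains `c`. The sketch re-verifies only the pairs tied to buckets the page left or entered, as the steps above describe; whether pairs in unchanged buckets also need re-verification is left open here.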
**The index is itself derived** (disposable, recomputable) and per-tenant-partitioned (§9).
Its parameters (LSH band/row counts, shingle size, precision/recall) are tunable; the accepted
**false-negative rate of blocking** is a known, tracked limitation (§12) — blocking trades a
small miss rate for tractability, and curator bindings are the escape hatch for misses.
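To make the band/row trade-off concrete, here is a small sketch of the standard banding estimate. It assumes MinHash-style signatures split into `bands` bands of `rows` rows each, which the text does not specify, and the function names are illustrative only. Under that assumption, a pair with Jaccard similarity `s` becomes a candidate with probability `1 - (1 - s**rows)**bands`, and the blocking false-negative rate for that pair is the complement.

```python
def candidate_probability(similarity: float, bands: int, rows: int) -> float:
    """Chance that a pair shares at least one band (assumes MinHash banding)."""
    return 1.0 - (1.0 - similarity ** rows) ** bands


def false_negative_rate(similarity: float, bands: int, rows: int) -> float:
    """Chance that blocking misses a pair of the given similarity."""
    return 1.0 - candidate_probability(similarity, bands, rows)


if __name__ == "__main__":
    # Print the miss rate across a few similarities for one parameter choice.
    for s in (0.3, 0.5, 0.7, 0.9):
        print(s, false_negative_rate(s, bands=20, rows=5))
```

More bands raise recall (fewer misses) at the cost of more candidates to verify; more rows per band do the opposite. Tracking `false_negative_rate` at the similarity threshold used by verification is one way to put a number on the accepted miss rate.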
**Verifying I-2 (`derived = f(canonical)`) — eventually, not on faith (review B-4).**
Incremental maintenance can drift from a from-scratch fold over time (a missed retraction, a
dropped event, a bug). I-2 is therefore an **eventually-verified** property, not a free one,
and the architecture names the mechanism that verifies it:
- **A digest of the derived tier.** Each partition's derived tier carries a rolling content
digest (a Merkle-style hash over union nodes/edges/index entries) maintained alongside the
incremental updates.
- **A background consistency-checker** periodically recomputes the digest over a *sampled* (or,
on a slow cadence, full) fold of canonical state and compares. A mismatch localises the drift
to a partition/region and triggers a **scoped recompute** of just that region — cheap relative
to a global rebuild, and self-healing.
- **So I-2 holds *eventually and verifiably*:** the incremental engine is the fast path, the
checker is the guarantee, and divergence is detected and repaired rather than silently
accumulating. The exact sampling rate / digest granularity is an implementation spike (§12).
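The digest-and-checker loop might look roughly like the sketch below. Everything in it is an assumption made for illustration: a partition is modelled as a dict of named regions, derived entries are tuples of strings, `fold` is a caller-supplied function that rebuilds one region's derived entries from canonical state, and the digest is recomputed per region rather than maintained incrementally as the text describes.

```python
import hashlib
import random


def region_digest(entries):
    """Order-independent SHA-256 digest over one region's derived entries."""
    h = hashlib.sha256()
    for item in sorted(repr(e) for e in entries):
        h.update(hashlib.sha256(item.encode()).digest())
    return h.hexdigest()


def partition_digest(region_digests):
    """Merkle-style root: hash of the region digests in region-key order."""
    h = hashlib.sha256()
    for key in sorted(region_digests):
        h.update(key.encode())
        h.update(region_digests[key].encode())
    return h.hexdigest()


def check_and_heal(derived, canonical, fold, sample_rate=1.0, rng=random):
    """Compare sampled regions against a fresh fold; recompute those that drifted.

    derived:   region -> set of entries maintained incrementally
    canonical: region -> canonical state for that region
    fold:      function(canonical_region) -> set of entries (from-scratch result)
    Returns the list of regions that were repaired.
    """
    healed = []
    for region in sorted(canonical):
        if rng.random() >= sample_rate:
            continue  # region not sampled on this pass
        expected = fold(canonical[region])
        if region_digest(derived.get(region, ())) != region_digest(expected):
            derived[region] = expected  # scoped recompute of just this region
            healed.append(region)
    return healed
```

With `sample_rate=1.0` every region is checked, which corresponds to the slow-cadence full pass; a lower rate checks a random subset per run. Keeping one digest per region is what lets a mismatch be localised without rebuilding the whole partition.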
### 8.8 Cache freshness & invalidation
Replication-projection caches remote shard content; cache invalidation is the actual hard part


@@ -94,7 +94,7 @@ Settle the keystone (review B-1 + B-3 together). Decide and document:
```task
id: SHARD-WP-0006-T3
status: todo
status: done
priority: high
state_hub_task_id: "900c8234-ca73-4225-b2c5-77d218ded28c"
```