spec(SHARD-WP-0006 T4): incremental-equivalence correctness + I-2 verification (§8.7)

Fixes B-4. Incremental delta is not additive: a change processes bucket exits (retract unsupported edges) + entries (add) + propagation across equivalence neighbours, not just new candidates. Adds an I-2 verification mechanism: per-partition Merkle-style digest + background consistency-checker vs sampled fold → scoped self-healing recompute on drift. I-2 now eventually-verified, not asserted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -645,14 +645,37 @@ comparison across all pages of all shards is O(N²) and is forbidden. Instead:
 ≈O(N) candidates.
 2. **Verification**: candidate pairs are confirmed by full fingerprint / span-set overlap and
 any curator binding. Confirmed equivalences become union edges.
-3. **Incremental maintenance**: a changed page is re-bucketed and only its *new* candidate set
-is re-verified; equivalence is maintained per-change, never recomputed globally.
+3. **Incremental maintenance: the delta is *not* additive (review B-4).** A changed page may
+*leave* buckets as well as *enter* them, and leaving a bucket can **break an existing
+equivalence edge** another page relied on. So a change is processed as: (i) recompute the
+page's bucket membership; (ii) for buckets it **left**, re-verify the pairs that depended on
+the shared bucket and **retract** edges no longer supported; (iii) for buckets it **entered**,
+verify the new candidate pairs and **add** edges; (iv) **propagate** to the equivalence
+neighbours of any retracted/added edge (equivalence is transitive-ish via chorus sets, so a
+retraction can split a set). Maintenance is per-change and bounded by the page's
+neighbourhood, but it covers retraction and propagation, not just additions.
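Steps (i) to (iv) of the new item 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the spec's implementation: `EquivalenceIndex`, `bucket_fn`, `verify_fn` and `jaccard_at_least_half` are hypothetical names, and connected components over verified edges stand in for chorus sets, which this section does not define. The sketch follows the four steps literally, so it only re-verifies pairs from buckets the page left or entered.

```python
from collections import defaultdict


def jaccard_at_least_half(a, b):
    """Hypothetical verification: token-set Jaccard similarity >= 0.5."""
    union = a | b
    return bool(union) and len(a & b) / len(union) >= 0.5


class EquivalenceIndex:
    """Toy blocking index: buckets -> candidate pairs -> verified edges."""

    def __init__(self, bucket_fn, verify_fn):
        self.bucket_fn = bucket_fn            # content -> iterable of bucket keys (assumed)
        self.verify_fn = verify_fn            # (content_a, content_b) -> bool (assumed)
        self.content = {}                     # page id -> content
        self.members = defaultdict(set)       # bucket key -> page ids
        self.buckets_of = defaultdict(set)    # page id -> bucket keys
        self.edges = set()                    # frozenset({page_a, page_b})

    def _supported(self, a, b):
        # An edge is supported if the pair still shares a bucket and still verifies.
        if not (self.buckets_of[a] & self.buckets_of[b]):
            return False
        return self.verify_fn(self.content[a], self.content[b])

    def equivalence_set(self, page):
        # Connected component of `page` over verified edges (stand-in for a chorus set).
        seen, stack = {page}, [page]
        while stack:
            current = stack.pop()
            for edge in self.edges:
                if current in edge:
                    (other,) = edge - {current}
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        return seen

    def apply_change(self, page, new_content):
        affected = self.equivalence_set(page)     # neighbours before the change
        old = set(self.buckets_of[page])
        new = set(self.bucket_fn(new_content))
        self.content[page] = new_content
        left, entered = old - new, new - old
        # (i) recompute the page's bucket membership
        for key in left:
            self.members[key].discard(page)
        for key in entered:
            self.members[key].add(page)
        self.buckets_of[page] = new
        # (ii) buckets it left: re-verify dependent pairs, retract unsupported edges
        for key in left:
            for other in self.members[key]:
                edge = frozenset((page, other))
                if edge in self.edges and not self._supported(page, other):
                    self.edges.discard(edge)
        # (iii) buckets it entered: verify new candidate pairs, add edges
        for key in entered:
            for other in self.members[key] - {page}:
                if self._supported(page, other):
                    self.edges.add(frozenset((page, other)))
        # (iv) propagate: recompute the sets of every page that was or is a neighbour
        affected |= self.equivalence_set(page)
        return {frozenset(self.equivalence_set(p)) for p in affected}


# Usage: B bridges A and C, then shrinks and leaves buckets "c" and "d".
idx = EquivalenceIndex(bucket_fn=set, verify_fn=jaccard_at_least_half)
idx.apply_change("A", {"a", "b"})
idx.apply_change("B", {"a", "b", "c", "d"})
idx.apply_change("C", {"c", "d"})
before = idx.equivalence_set("A")
after = idx.apply_change("B", {"a", "b"})
```

In the usage lines, the B-C edge is retracted when B leaves the shared buckets, and the single set containing A, B and C splits into two. An add-only delta would have left C attached to a set it no longer belongs to.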

 **The index is itself derived** (disposable, recomputable) and per-tenant-partitioned (§9).
 Its parameters (LSH band/row counts, shingle size, precision/recall) are tunable; the accepted
 **false-negative rate of blocking** is a known, tracked limitation (§12): blocking trades a
 small miss rate for tractability, and curator bindings are the escape hatch for misses.
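The band/row trade-off can be made concrete with the standard banding estimate. The paragraph says only "LSH band/row counts", so MinHash-style banding is an assumption here: under it, a pair with Jaccard similarity s shares at least one of b bands of r rows with probability 1 - (1 - s^r)^b, and the complement is the blocking miss rate. The function names below are illustrative, not taken from the spec.

```python
def candidate_probability(similarity: float, bands: int, rows: int) -> float:
    """Probability that a pair becomes a candidate: 1 - (1 - s**r)**b."""
    return 1.0 - (1.0 - similarity ** rows) ** bands


def blocking_miss_rate(similarity: float, bands: int, rows: int) -> float:
    """False-negative rate of blocking for a pair at this similarity: (1 - s**r)**b."""
    return (1.0 - similarity ** rows) ** bands


if __name__ == "__main__":
    # Compare a few (bands, rows) settings for a similar pair (0.8) and a dissimilar one (0.3).
    for bands, rows in [(10, 10), (20, 5), (50, 2)]:
        missed = blocking_miss_rate(0.8, bands, rows)
        noise = candidate_probability(0.3, bands, rows)
        print(f"b={bands} r={rows} miss@0.8={missed:.6f} candidate@0.3={noise:.6f}")
```

More bands lower the miss rate at a fixed similarity but admit more low-similarity pairs into verification, which is the precision/recall knob the paragraph refers to.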

+**Verifying I-2 (`derived = f(canonical)`): eventually, not on faith (review B-4).**
+Incremental maintenance can drift from a from-scratch fold over time (a missed retraction, a
+dropped event, a bug). I-2 is therefore an **eventually-verified** property, not a free one,
+and the architecture names the mechanism that verifies it:
+
+- **A digest of the derived tier.** Each partition's derived tier carries a rolling content
+digest (a Merkle-style hash over union nodes/edges/index entries) maintained alongside the
+incremental updates.
+- **A background consistency-checker** periodically recomputes the digest over a *sampled* (or,
+on a slow cadence, full) fold of canonical state and compares. A mismatch localises the drift
+to a partition/region and triggers a **scoped recompute** of just that region, cheap relative
+to a global rebuild, and self-healing.
+- **So I-2 holds *eventually and verifiably*:** the incremental engine is the fast path, the
+checker is the guarantee, and divergence is detected and repaired rather than silently
+accumulating. The exact sampling rate / digest granularity is an implementation spike (§12).
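The digest-and-checker mechanism in the bullets above can be sketched as follows. This is one possible shape under loud assumptions, since the spec defers sampling rate and digest granularity to a spike: each region keeps an XOR of SHA-256 entry hashes (an order-independent set hash that is cheap to update, swapped in here for a full Merkle tree), and a root hash over the region digests gives the two-level, Merkle-style structure. `DerivedPartition`, `check_region`, `fold` and the region count are hypothetical names and values.

```python
import hashlib


def entry_hash(entry) -> int:
    """Stable 256-bit hash of one derived entry (entries are assumed to be tuples of strings)."""
    return int.from_bytes(hashlib.sha256(repr(entry).encode("utf-8")).digest(), "big")


def region_of(entry, regions: int) -> int:
    """Assign an entry to a region; the digest granularity is an assumption."""
    return entry_hash(entry) % regions


def digest_of(entries) -> int:
    """XOR-combined set hash: order-independent and incrementally updatable."""
    digest = 0
    for entry in entries:
        digest ^= entry_hash(entry)
    return digest


class DerivedPartition:
    """Derived tier of one partition, with a rolling digest per region."""

    def __init__(self, regions: int = 4):
        self.regions = regions
        self.entries = [set() for _ in range(regions)]
        self.digests = [0] * regions

    def add(self, entry):
        r = region_of(entry, self.regions)
        if entry not in self.entries[r]:
            self.entries[r].add(entry)
            self.digests[r] ^= entry_hash(entry)   # maintained alongside the update

    def remove(self, entry):
        r = region_of(entry, self.regions)
        if entry in self.entries[r]:
            self.entries[r].remove(entry)
            self.digests[r] ^= entry_hash(entry)   # XOR is its own inverse

    def root(self) -> str:
        """Root over the region digests (second level of the Merkle-style digest)."""
        joined = b"".join(d.to_bytes(32, "big") for d in self.digests)
        return hashlib.sha256(joined).hexdigest()


def check_region(partition, fold, canonical, region) -> bool:
    """Compare one region against a from-scratch fold; repair it and return True on drift.

    For brevity the fold runs over all canonical state and is then filtered;
    a real checker would fold only the sampled region.
    """
    expected = {e for e in fold(canonical) if region_of(e, partition.regions) == region}
    if digest_of(expected) == partition.digests[region]:
        return False
    partition.entries[region] = expected            # scoped recompute of this region only
    partition.digests[region] = digest_of(expected)
    return True


# Usage: a retraction is missed, then the checker finds and heals the drift.
canonical = {("A", "B"), ("B", "C")}
fold = set                                          # hypothetical fold: edges are the derived entries
partition = DerivedPartition()
for edge in canonical:
    partition.add(edge)
canonical.discard(("B", "C"))                       # canonical changes, the event is dropped
drifted_root = partition.root()
repaired = [check_region(partition, fold, canonical, r) for r in range(partition.regions)]
```

Only the region holding the stale entry is rebuilt; the other regions compare equal and are left alone, which is what keeps the repair cheap relative to a global rebuild.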
+
 ### 8.8 Cache freshness & invalidation

 Replication-projection caches remote shard content; cache invalidation is the actual hard part
@@ -94,7 +94,7 @@ Settle the keystone (review B-1 + B-3 together). Decide and document:

 ```task
 id: SHARD-WP-0006-T3
-status: todo
+status: done
 priority: high
 state_hub_task_id: "900c8234-ca73-4225-b2c5-77d218ded28c"
 ```