A Merkle-style digest summarizes the derived tier (per-identity fingerprint +
incident edges as order-independent leaves) so equal states have equal digests
and the digest is stable under equivalent event orders. A ConsistencyChecker
recomputes the authoritative fold from the current source, compares it with the
derived tier over a sampled region, and on mismatch recomputes only the affected
identities. This self-heals missed-delta drift, corrupted internal state, and
vanished pages, so derived = f(canonical) is verified, not asserted.
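
A minimal sketch of the two ideas above, not the code in this commit. The
names state_digest and check_and_heal, the dict-based state layout, and the
choice of SHA-256 over sorted leaf hashes are all assumptions made for
illustration; the real ConsistencyChecker interface may differ.

```python
import hashlib
import json
import random

def leaf_hash(identity, fingerprint, edges):
    # Sort incident edges so the leaf does not depend on insertion order.
    payload = json.dumps([identity, fingerprint, sorted(edges)]).encode("utf-8")
    return hashlib.sha256(b"\x00" + payload).digest()

def merkle_root(leaves):
    # Sorting the leaf hashes makes the root independent of event order.
    level = sorted(leaves)
    if not level:
        return hashlib.sha256(b"empty").digest()
    while len(level) > 1:
        paired = [hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # promote the unpaired node unchanged
        level = paired
    return level[0]

def state_digest(state):
    # state (hypothetical layout): identity -> (fingerprint, iterable of edges)
    return merkle_root([leaf_hash(i, fp, e) for i, (fp, e) in state.items()]).hex()

def check_and_heal(derived, source, fold, sample_size, rng=None):
    # Recompute fold(source) for a sampled region and repair only mismatches.
    # Assumes fold never returns None, which is used here to mean "vanished".
    rng = rng or random.Random()
    universe = sorted(set(derived) | set(source))
    sample = rng.sample(universe, min(sample_size, len(universe)))
    repaired = []
    for identity in sample:
        expected = fold(source[identity]) if identity in source else None
        if derived.get(identity) == expected:
            continue
        if expected is None:
            derived.pop(identity, None)   # page vanished from the source
        else:
            derived[identity] = expected  # drift or corruption: overwrite
        repaired.append(identity)
    return repaired
```

In this sketch, two states built from the same facts in a different order give
the same state_digest, and check_and_heal touches only the sampled identities
whose derived value disagrees with the recomputed fold.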
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Detect equivalence (distinct identities holding the same page) without O(N²)
pairwise comparison: MinHash/LSH bands over content shingles plus
normalized-title buckets generate candidates (blocking), then an
exact-fingerprint match or Jaccard >= threshold confirms them (verify). Curator
decision-log bindings always form edges.
Groups are the connected components of the edge set. Includes the incremental
add/update/remove internals used by T2. Matches a brute-force oracle. New
incremental/ package (minhash primitives + EquivalenceIndex).
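
A rough sketch of the block-then-verify pipeline described above, assuming
word shingles, BLAKE2b-seeded MinHash, and a union-find for the connected
components. The function names, parameters, and defaults are hypothetical and
are not the EquivalenceIndex API; normalized-title buckets, the
exact-fingerprint shortcut, and incremental updates are left out for brevity.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

def shingles(text, k=3):
    # Word k-shingles; short or empty texts yield a single shingle.
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def minhash(sh, num_perm):
    # One salted hash per simulated permutation; keep the minimum of each.
    def h(s, seed):
        salt = seed.to_bytes(8, "little")
        d = hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest()
        return int.from_bytes(d, "big")
    return tuple(min(h(s, seed) for s in sh) for seed in range(num_perm))

def jaccard(a, b):
    return len(a & b) / len(a | b)

def equivalence_groups(pages, bands=8, rows=4, threshold=0.5, bindings=()):
    # pages: identity -> page text; bindings: curator pairs that always link.
    sh = {pid: shingles(text) for pid, text in pages.items()}
    sig = {pid: minhash(s, bands * rows) for pid, s in sh.items()}

    # Blocking: identities sharing any full band land in the same bucket.
    buckets = defaultdict(set)
    for pid, s in sig.items():
        for b in range(bands):
            buckets[(b, s[b * rows:(b + 1) * rows])].add(pid)

    parent = {pid: pid for pid in pages}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Verify: only candidate pairs are compared exactly.
    for members in buckets.values():
        for a, b in combinations(sorted(members), 2):
            if jaccard(sh[a], sh[b]) >= threshold:
                parent[find(a)] = find(b)
    for a, b in bindings:
        parent[find(a)] = find(b)

    comps = defaultdict(set)
    for pid in pages:
        comps[find(pid)].add(pid)
    return sorted(sorted(c) for c in comps.values())
```

Because candidates are confirmed with an exact Jaccard check, a spurious band
collision cannot create an edge in this sketch; LSH can only miss pairs, which
is what a comparison against a brute-force oracle would catch.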
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>