WP-0009 git-backed DecisionLog + per-space append authority (keystone backing); WP-0010 derived views (wikilinks, BackLinks, RecentChanges, AllPages/SiteMap); WP-0011 incremental union + equivalence index + I-2 verification; WP-0012 second adapter (git-IS-store) validating the contract on a new substrate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
| id | type | title | domain | repo | status | owner | topic_slug | created | updated | depends_on |
|---|---|---|---|---|---|---|---|---|---|---|
| SHARD-WP-0011 | workplan | incremental union maintenance + equivalence index + I-2 verification | whynot | shard-wiki | active | tegwick | whynot | 2026-06-15 | 2026-06-15 | |
SHARD-WP-0011 — Incremental union + equivalence index
Goal
Replace direct, recompute-on-read resolution with the incremental-first derived tier from
CoreArchitectureBlueprint §8.7: change-driven delta maintenance of the union/indexes, an
indexed equivalence path (blocking/LSH, not O(N²)) with correct retraction/propagation
(review B-4), and the I-2 verification mechanism (digest + background consistency-checker).
Rebuild becomes a bounded fallback, not the operational path.
Non-goals: distributed maintenance and a persisted on-disk index store (an in-memory derived tier is fine for this slice). Per-tenant partitioning (I-13) is honoured structurally, but multi-tenant deployment comes later.
Context
- Spec: blueprint §8.7 (incremental, blocking/LSH, rebuild-as-fallback), §8.4; review B-4 + open items O-1/O-4.
- Builds on union resolution (SHARD-WP-0007) and the views (SHARD-WP-0010) it accelerates.
Equivalence index: blocking + verify
id: SHARD-WP-0011-T1
status: todo
priority: high
A candidate-generation (blocking) layer — normalized title/path buckets + MinHash/LSH bands over content shingles — then verify (fingerprint / span-set overlap + curator bindings) to produce equivalence edges. Replaces pairwise O(N²). Tests: near-duplicates bucket together; unrelated pages don't; verified edges match a brute-force oracle on a small corpus.
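
A minimal sketch of the block-then-verify shape, under stated assumptions: the function names, the 32-hash signature, the 16×2 banding and the 0.5 Jaccard threshold are all illustrative and not taken from the blueprint. Title/path buckets and curator bindings are left out; only the MinHash/LSH content path and a shingle-overlap verify step are shown, next to the brute-force oracle the tests would compare against.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

NUM_HASHES = 32               # signature length (assumed)
BANDS = 16                    # 16 bands of 2 rows each (assumed)
ROWS = NUM_HASHES // BANDS
THRESHOLD = 0.5               # verify threshold on shingle Jaccard (assumed)


def shingles(text, k=3):
    """Normalised word k-shingles; short texts yield a single shingle."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def _hash(seed, shingle):
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(shingle.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash(shingle_set):
    """One minimum per seeded hash function."""
    return tuple(min(_hash(seed, s) for s in shingle_set) for seed in range(NUM_HASHES))


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(shingled):
    """Blocking: pages sharing any LSH band land in the same bucket."""
    buckets = defaultdict(set)
    for pid, sh in shingled.items():
        sig = minhash(sh)
        for band in range(BANDS):
            buckets[(band, sig[band * ROWS:(band + 1) * ROWS])].add(pid)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


def equivalence_edges(pages):
    """Verify each candidate pair; only verified pairs become edges."""
    shingled = {pid: shingles(text) for pid, text in pages.items()}
    return {(a, b) for a, b in candidate_pairs(shingled)
            if jaccard(shingled[a], shingled[b]) >= THRESHOLD}


def brute_force_edges(pages):
    """Pairwise O(N^2) oracle, for small test corpora only."""
    shingled = {pid: shingles(text) for pid, text in pages.items()}
    return {(a, b) for a, b in combinations(sorted(pages), 2)
            if jaccard(shingled[a], shingled[b]) >= THRESHOLD}
```

Because verification uses the same predicate as the oracle, the indexed result can only ever miss edges, never invent them; the oracle comparison is therefore a recall check on the banding parameters.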
Incremental maintenance (delta, not additive)
id: SHARD-WP-0011-T2
status: todo
priority: high
Change-driven delta updates: a changed/added/removed page re-buckets, then (per B-4) retracts edges it leaves, adds edges it enters, and propagates to equivalence neighbours (a retraction can split a chorus set). Drives union/BackLinks/RecentChanges deltas. Tests: add/edit/remove keep the index equal to a from-scratch rebuild; a bucket-exit retracts a stale edge.
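
The delta behaviour can be sketched as follows. This is a hedged illustration, not the planned implementation: `EquivalenceIndex`, `rebuild_edges` and `chorus_sets` are hypothetical names, blocking is simplified to one bucket per token, and verification is plain token-set Jaccard at an assumed 0.5 threshold. It shows an upsert retracting stale edges before adding new ones, and chorus sets (taken here as connected components) splitting when a bridging page changes.

```python
from collections import defaultdict
from itertools import combinations

THRESHOLD = 0.5  # assumed verify threshold


def tokens(text):
    return frozenset(text.lower().split())


def similar(a, b):
    union = a | b
    return bool(union) and len(a & b) / len(union) >= THRESHOLD


class EquivalenceIndex:
    def __init__(self):
        self.content = {}                # page id -> token set
        self.buckets = defaultdict(set)  # blocking key (token) -> page ids
        self.edges = set()               # frozenset({a, b})

    def remove(self, pid):
        """Leave every bucket and retract every edge the page took part in."""
        old = self.content.pop(pid, None)
        if old is None:
            return set()
        for key in old:
            self.buckets[key].discard(pid)
            if not self.buckets[key]:
                del self.buckets[key]
        retracted = {e for e in self.edges if pid in e}
        self.edges -= retracted
        return retracted

    def upsert(self, pid, text):
        """Re-bucket the page; return (edges added, edges retracted) as a net delta."""
        retracted = self.remove(pid)
        toks = tokens(text)
        self.content[pid] = toks
        for key in toks:
            self.buckets[key].add(pid)
        candidates = {q for key in toks for q in self.buckets[key]} - {pid}
        added = {frozenset((pid, q)) for q in candidates
                 if similar(toks, self.content[q])}
        self.edges |= added
        return added - retracted, retracted - added


def rebuild_edges(pages):
    """From-scratch oracle the incremental index must stay equal to."""
    toks = {p: tokens(t) for p, t in pages.items()}
    return {frozenset((a, b)) for a, b in combinations(sorted(toks), 2)
            if similar(toks[a], toks[b])}


def chorus_sets(edges, page_ids):
    """Connected components of the equivalence edges (union-find)."""
    parent = {p: p for p in page_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for edge in edges:
        a, b = tuple(edge)
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for p in page_ids:
        groups[find(p)].add(p)
    return {frozenset(g) for g in groups.values()}
```

With pages `a = "x y z"`, `b = "y z w"` and `c = "z w v"`, page `b` bridges `a` and `c` into one chorus set; rewriting `b` retracts both edges and leaves three singletons, which is the split case B-4 calls out.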
I-2 verification: digest + consistency-checker
id: SHARD-WP-0011-T3
status: todo
priority: high
A per-partition Merkle-style digest of the derived tier, maintained alongside deltas, and a
background consistency-checker that recomputes a sampled fold and compares; mismatch →
scoped recompute of the affected region (self-healing). Makes derived = f(canonical)
verified, not asserted. Tests: induced drift is detected and repaired; digest stable under
equivalent event orders.
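
One way to picture the digest and checker, as a sketch under assumptions: instead of a full Merkle tree this uses a simpler stand-in, an XOR fold of per-entry SHA-256 hashes, which is order-independent and cheap to update per delta. `DerivedPartition`, `fold` and `check_and_heal` are hypothetical names, the canonical state is modelled as a plain dict, and the checker compares a whole partition rather than a sample.

```python
import hashlib


def entry_hash(key, value):
    """Stable hash of one derived entry."""
    data = repr((key, value)).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def fold(state):
    """Recompute the digest from scratch over a key -> value mapping."""
    digest = 0
    for key, value in state.items():
        digest ^= entry_hash(key, value)
    return digest


class DerivedPartition:
    def __init__(self):
        self.entries = {}
        self.digest = 0  # XOR of entry hashes, maintained alongside deltas

    def apply(self, key, value):
        """Apply one delta; value=None removes the entry."""
        old = self.entries.pop(key, None)
        if old is not None:
            self.digest ^= entry_hash(key, old)   # XOR out the old entry
        if value is not None:
            self.entries[key] = value
            self.digest ^= entry_hash(key, value)  # XOR in the new entry


def check_and_heal(partition, canonical):
    """Compare the maintained digest with a fresh fold; recompute on mismatch.

    Returns True when drift was detected and repaired.
    """
    expected = fold(canonical)
    if partition.digest == expected:
        return False
    partition.entries = dict(canonical)  # scoped recompute of this partition
    partition.digest = expected
    return True
```

XOR is commutative and self-inverse, so deltas on distinct keys yield the same digest in any order. A tree-shaped digest would additionally narrow a mismatch down to a sub-region; the flat fold here can only flag the whole partition.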
Wire incremental tier behind resolution + views
id: SHARD-WP-0011-T4
status: todo
priority: medium
Route UnionGraph.resolve and the SHARD-WP-0010 views through the maintained index (rebuild =
explicit fallback). Behaviour is unchanged from the consumer's view; only freshness/cost change.
Update SCOPE; pytest + pyflakes green.
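
The wiring could take the form of a thin facade, sketched here with hypothetical names throughout: `IncrementalResolver`, `apply_delta` and `invalidate` are illustrative, and nothing below reflects the actual `UnionGraph` signature. Reads go to the maintained index, and the full rebuild runs only when the index is missing or has been explicitly invalidated.

```python
class IncrementalResolver:
    """Serve reads from a maintained index; rebuild only as an explicit fallback."""

    def __init__(self, rebuild_fn):
        # rebuild_fn() returns page id -> equivalence set; stands in for a full recompute.
        self._rebuild_fn = rebuild_fn
        self._index = None
        self.rebuilds = 0  # exposed so tests can assert rebuild stays off the hot path

    def apply_delta(self, page_id, members):
        """Fold one change into the index; members=None removes the page."""
        if self._index is None:
            return  # nothing maintained yet; the next read rebuilds
        if members is None:
            self._index.pop(page_id, None)
        else:
            self._index[page_id] = frozenset(members)

    def invalidate(self):
        """Force the bounded fallback, e.g. after the consistency-checker flags drift."""
        self._index = None

    def resolve(self, page_id):
        if self._index is None:
            self._index = dict(self._rebuild_fn())
            self.rebuilds += 1
        # A page with no recorded equivalents resolves to itself.
        return self._index.get(page_id, frozenset({page_id}))
```

Keeping the `resolve` return shape identical to what a direct recompute would give is what lets consumers and the SHARD-WP-0010 views stay untouched.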
Acceptance criteria
- Equivalence is indexed (blocking/LSH + verify), not pairwise; matches a brute-force oracle.
- Incremental maintenance (with retraction + propagation) keeps the derived tier equal to a from-scratch rebuild; rebuild is a bounded fallback.
- I-2 is verified by a digest + consistency-checker that detects and self-heals drift.
- Consumer-visible resolution/views behaviour unchanged; pytest + pyflakes green; synced.
Suggested task order
T1 equivalence index → T2 incremental maintenance → T3 I-2 verification → T4 wiring.