feat(incremental): wire maintained tier behind views; rebuild fallback (WP-0011 T4)
Route InformationSpace.all_pages through a maintained UnionIndex: equivalence is served from the incrementally maintained index (curator bindings re-synced live from the log fold + detected content edges), exposed in decision-log string form so results are a behaviour-preserving superset.

The index is built lazily and rebuilt (bounded fallback) when the union mutates (attach/edit invalidate it); reindex() forces a rebuild and verify_index() runs the I-2 self-healing checker. all_pages() gains an optional equivalence_groups source (default = fold) so direct callers are unaffected.

SCOPE updated; WP-0011 done. 173 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
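The lazy-build and invalidate-on-mutation behaviour described in the message can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code in this commit: `LazyIndex`, its `build` callback and the `rebuilds` counter are hypothetical names introduced here, and a plain frozenset stands in for the real index.

```python
from typing import Callable, FrozenSet, Optional


class LazyIndex:
    """Hypothetical stand-in: build on first read, rebuild only after invalidation."""

    def __init__(self, build: Callable[[], FrozenSet[str]]) -> None:
        self._build = build  # assumed: a full, from-scratch derivation
        self._value: Optional[FrozenSet[str]] = None
        self._stale = True
        self.rebuilds = 0  # illustration only: counts full rebuilds

    def invalidate(self) -> None:
        """Called by mutating operations (attach/edit in the commit's terms)."""
        self._stale = True

    def get(self) -> FrozenSet[str]:
        """Return the derived value, rebuilding only when it is missing or stale."""
        if self._value is None or self._stale:
            self._value = self._build()
            self.rebuilds += 1
            self._stale = False
        return self._value


pages = {"a:Foo"}
index = LazyIndex(lambda: frozenset(pages))

first = index.get()   # first read triggers the build
second = index.get()  # second read reuses the built value
pages.add("b:Bar")
index.invalidate()    # a mutation marks the index stale
third = index.get()   # next read rebuilds
```

Reads between mutations pay nothing beyond a flag check; the cost of a full rebuild is deferred to the first read after a mutation.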
2
SCOPE.md
@@ -17,7 +17,7 @@ Learnings update both SCOPE and INTENT where necessary.
 
 | Layer | State |
 |-------|-------|
-| Code | Foundation slice implemented (SHARD-WP-0007): `provenance` + `policy` leaves, `model` (Identity/Placement/Span/Page/CapabilityProfile), `adapters` (contract + FolderAdapter + conformance suite), `coordination` (event-sourced DecisionLog), `union` (resolution + chorus, overlay-aware), `InformationSpace` orchestrator. Write path added (SHARD-WP-0008): writable adapter, overlay engine (draft→patch→apply-under-drift), edit() unifies write-through + overlay-before-mutation. Native engine implemented (SHARD-WP-0014): `engine` (kernel + typed-extension runtime + per-shard activation [ADR-0001] + capability-profile-from-extensions + EngineShardAdapter + the `ext.struct` built-in) — an engine shard attaches to an InformationSpace as a canonical-mode shard. Git-backed coordination log (SHARD-WP-0009): `DecisionLog` storage factored behind an `EventStore`; `GitEventStore` makes the log git-addressable (each space a ref, append = immutable CAS-guarded commit), a per-space `AppendAuthority` (lease) gives a single-writer total order with re-grantable HA hand-off, cross-process read-your-writes verified, and a verbatim one-time importer (`migrate_space`/JSONL) replays in-memory logs into git; `InformationSpace.git_backed(...)` wires it. Derived views (SHARD-WP-0010): `views` (wikilink + red-link model, BackLinks, RecentChanges, AllPages/SiteMap) — recomputable, provenance-carrying, presentation-free, exposed via `InformationSpace.backlinks/recent_changes/all_pages/site_map`. 152 tests green, ~97% coverage |
+| Code | Foundation slice implemented (SHARD-WP-0007): `provenance` + `policy` leaves, `model` (Identity/Placement/Span/Page/CapabilityProfile), `adapters` (contract + FolderAdapter + conformance suite), `coordination` (event-sourced DecisionLog), `union` (resolution + chorus, overlay-aware), `InformationSpace` orchestrator. Write path added (SHARD-WP-0008): writable adapter, overlay engine (draft→patch→apply-under-drift), edit() unifies write-through + overlay-before-mutation. Native engine implemented (SHARD-WP-0014): `engine` (kernel + typed-extension runtime + per-shard activation [ADR-0001] + capability-profile-from-extensions + EngineShardAdapter + the `ext.struct` built-in) — an engine shard attaches to an InformationSpace as a canonical-mode shard. Git-backed coordination log (SHARD-WP-0009): `DecisionLog` storage factored behind an `EventStore`; `GitEventStore` makes the log git-addressable (each space a ref, append = immutable CAS-guarded commit), a per-space `AppendAuthority` (lease) gives a single-writer total order with re-grantable HA hand-off, cross-process read-your-writes verified, and a verbatim one-time importer (`migrate_space`/JSONL) replays in-memory logs into git; `InformationSpace.git_backed(...)` wires it. Derived views (SHARD-WP-0010): `views` (wikilink + red-link model, BackLinks, RecentChanges, AllPages/SiteMap) — recomputable, provenance-carrying, presentation-free, exposed via `InformationSpace.backlinks/recent_changes/all_pages/site_map`. Incremental-first derived tier (SHARD-WP-0011): `incremental` (indexed equivalence via MinHash/LSH blocking + verify, change-driven delta maintenance with retraction/propagation, Merkle-style digest + self-healing I-2 consistency-checker, `UnionIndex` routed behind `InformationSpace.all_pages` with rebuild as explicit fallback). 173 tests green, ~97% coverage |
 | Intent | `INTENT.md` established; authorization-in-core amendments drafted |
 | Research | yawex prior art; c2 origins; federation concepts; wikiengines overview (`research/260608-*/`); XWiki/TWiki/Foswiki deep dives (`research/260613-*/`); Xanadu + ZigZag + Roam + Obsidian + Notion + Joplin + Logseq + local-first workspaces (Anytype/AFFiNE/AppFlowy) + Trilium + Wiki.js + Federated Wiki + Wikibase + git-forge wikis + TiddlyWiki + ikiwiki + Quip + MojoMojo + Oddmuse + UseModWiki deep dives & shard-spectrum synthesis (`research/260614-*/`) |
 | Demand | NetKingdom integration asks captured, not yet negotiated |
@@ -22,6 +22,7 @@ from shard_wiki.incremental.minhash import (
     jaccard,
     shingles,
 )
+from shard_wiki.incremental.union_index import UnionIndex
 from shard_wiki.incremental.verification import (
     ConsistencyChecker,
     ConsistencyReport,
@@ -41,4 +42,5 @@ __all__ = [
     "region_digest",
     "ConsistencyReport",
     "ConsistencyChecker",
+    "UnionIndex",
 ]
@@ -134,6 +134,10 @@ class EquivalenceIndex:
     def unbind(self, a: Identity, b: Identity) -> None:
         self._curator_edges.discard(_pair(a, b))

+    def set_curator_edges(self, edges: Iterable[tuple[Identity, Identity]]) -> None:
+        """Replace all curator edges at once (re-syncing from the decision-log fold)."""
+        self._curator_edges = {_pair(a, b) for a, b in edges if a != b}
+
     # -- queries -------------------------------------------------------------

     def identities(self) -> frozenset[Identity]:
91
src/shard_wiki/incremental/union_index.py
Normal file
@@ -0,0 +1,91 @@
+"""UnionIndex — the maintained derived tier wired behind resolution + views (SHARD-WP-0011 T4).
+
+Wraps a :class:`UnionGraph` + decision log with an incrementally maintained
+:class:`EquivalenceIndex`. Content equivalence is kept fresh by deltas (``note_change`` /
+``note_removed``); curator bindings are re-synced live from the log fold. A full :meth:`rebuild`
+is the bounded fallback. :meth:`verify` runs the I-2 consistency-checker over the live source.
+
+Consumer-visible results are unchanged — equivalence groups are exposed in the same string form the
+decision-log fold uses, a *superset* that additionally collapses genuine content duplicates — only
+freshness and cost differ (recompute-on-read becomes change-driven).
+"""
+
+from __future__ import annotations
+
+from shard_wiki.coordination import DecisionLog
+from shard_wiki.incremental.equivalence import EquivalenceIndex
+from shard_wiki.incremental.verification import (
+    ConsistencyChecker,
+    ConsistencyReport,
+    derived_digest,
+)
+from shard_wiki.model import Identity, Page
+from shard_wiki.union import UnionGraph
+
+__all__ = ["UnionIndex"]
+
+
+def _identity(token: str) -> Identity:
+    shard, _, key = token.partition(":")
+    return Identity(shard, key)
+
+
+class UnionIndex:
+    """An incrementally maintained equivalence index over a union, with a rebuild fallback."""
+
+    def __init__(self, union: UnionGraph, log: DecisionLog, space: str) -> None:
+        self._union = union
+        self._log = log
+        self._space = space
+        self._eq = EquivalenceIndex()
+        self.rebuild()
+
+    def rebuild(self) -> None:
+        """The bounded fallback: re-derive the whole index from current union pages + bindings."""
+        self._eq.build(self._union.iter_pages())
+        self._sync_curator()
+
+    def note_change(self, page: Page) -> None:
+        """Change-driven update for one added/edited page (the operational path)."""
+        self._eq.update(page)
+
+    def note_removed(self, identity: Identity) -> None:
+        self._eq.remove(identity)
+
+    def _sync_curator(self) -> None:
+        """Re-sync curator equivalence from the live decision-log fold (cheap, always correct)."""
+        groups = self._log.fold(self._space).equivalence_groups
+        edges: list[tuple[Identity, Identity]] = []
+        for group in groups:
+            members = [_identity(m) for m in group]
+            edges.extend((members[0], other) for other in members[1:])
+        self._eq.set_curator_edges(edges)
+
+    def equivalence_groups(self) -> tuple[frozenset[str], ...]:
+        """Equivalence groups in decision-log string form (curator ∪ content), for the views."""
+        self._sync_curator()
+        return tuple(
+            frozenset(str(identity) for identity in group) for group in self._eq.groups()
+        )
+
+    def digest(self) -> str:
+        """The Merkle-style digest of the maintained derived tier (I-2)."""
+        self._sync_curator()
+        return derived_digest(self._eq)
+
+    def verify(self) -> ConsistencyReport:
+        """Check the maintained index against a from-scratch fold of the live source; self-heal."""
+        self._sync_curator()
+        checker = ConsistencyChecker(
+            self._eq,
+            pages=lambda: list(self._union.iter_pages()),
+            curator_edges=self._curator_pairs,
+        )
+        return checker.check_and_repair()
+
+    def _curator_pairs(self) -> list[tuple[Identity, Identity]]:
+        pairs: list[tuple[Identity, Identity]] = []
+        for group in self._log.fold(self._space).equivalence_groups:
+            members = [_identity(m) for m in group]
+            pairs.extend((members[0], other) for other in members[1:])
+        return pairs
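The curator re-sync in `union_index.py` turns each equivalence group of `"shard:key"` strings into a star of edges: the first member is paired with every other member, so a group of n members yields n - 1 edges. A standalone sketch of that conversion follows. The `Identity` named tuple and the helper names `parse_identity` and `star_edges` are hypothetical stand-ins for the project's own types, and the sorting and the singleton guard are additions made here for a deterministic, defensive example rather than behaviour taken from the diff.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Tuple


class Identity(NamedTuple):
    """Hypothetical stand-in for the project's Identity (shard, key)."""

    shard: str
    key: str


def parse_identity(token: str) -> Identity:
    # partition splits on the first ":" only, so keys may themselves contain ":"
    shard, _, key = token.partition(":")
    return Identity(shard, key)


def star_edges(groups: Iterable[FrozenSet[str]]) -> List[Tuple[Identity, Identity]]:
    """Connect each group as a star: (hub, other) for every non-hub member."""
    edges: List[Tuple[Identity, Identity]] = []
    for group in groups:
        # Sorting is an assumption made here so the hub choice is deterministic.
        members = sorted(parse_identity(m) for m in group)
        if len(members) < 2:
            continue  # nothing to bind in an empty or singleton group
        hub = members[0]
        edges.extend((hub, other) for other in members[1:])
    return edges


edges = star_edges([frozenset({"a:Foo", "b:Bar", "c:Baz"}), frozenset({"d:Solo"})])
```

A star is enough because equivalence is transitive: any connected set of edges over the group produces the same class once the index takes connected components.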
@@ -20,6 +20,7 @@ from shard_wiki.coordination import (
     Overlay,
     OverlayEngine,
 )
+from shard_wiki.incremental import ConsistencyReport, UnionIndex
 from shard_wiki.model import Page
 from shard_wiki.policy import DEFAULT_POLICY, Policy
 from shard_wiki.union import Resolution, UnionGraph
@@ -51,6 +52,8 @@ class InformationSpace:
         self.log = DecisionLog(store)
         self.union = UnionGraph(space_id, log=self.log, policy=policy)
         self.overlays = OverlayEngine(space_id, self.log)
+        self._index: UnionIndex | None = None  # maintained derived tier, built lazily
+        self._index_stale = True

     @classmethod
     def git_backed(
@@ -67,6 +70,7 @@ class InformationSpace:
         """Attach a shard — only if it passes conformance (verified profile, I-3/§6.6)."""
         assert_conformant(adapter)
         self.union.attach(adapter)
+        self._index_stale = True

     def alias(self, name: str, target: str, actor: str | None = None) -> None:
         """Record a coordination-canonical alias (``name`` → ``"shard:key"``) in the log."""
@@ -101,7 +105,29 @@ class InformationSpace:
         write-through-capable target fast-forwards (write-through); a read-only target keeps the
         draft as local truth (I-5: overlay before mutation, always)."""
         overlay = self.overlay(name, body, actor=actor)
-        return self.apply_overlay(overlay.overlay_id)
+        result = self.apply_overlay(overlay.overlay_id)
+        self._index_stale = True  # the applied edit changes the derived tier
+        return result
+
+    # --- maintained derived tier (SHARD-WP-0011): incremental-first, rebuild as fallback ---
+
+    @property
+    def index(self) -> UnionIndex:
+        """The maintained equivalence index (built lazily; rebuilt when the union has changed)."""
+        if self._index is None:
+            self._index = UnionIndex(self.union, self.log, self.space_id)
+        elif self._index_stale:
+            self._index.rebuild()  # bounded fallback after a mutation
+        self._index_stale = False
+        return self._index
+
+    def reindex(self) -> None:
+        """Force a full rebuild of the maintained derived tier (the explicit fallback path)."""
+        self.index.rebuild()
+
+    def verify_index(self) -> ConsistencyReport:
+        """Run the I-2 consistency-checker over the maintained tier; self-heal any drift."""
+        return self.index.verify()

     # --- derived views (SHARD-WP-0010): recomputable, provenance-carrying, presentation-free ---

@@ -114,8 +140,8 @@ class InformationSpace:
         return recent_changes(self.union, self.log, self.space_id, limit=limit)

     def all_pages(self) -> tuple[AllPagesEntry, ...]:
-        """The union's distinct pages, chorus/equivalence-collapsed with divergence noted."""
-        return all_pages(self.union)
+        """The union's distinct pages, collapsed via the maintained equivalence index."""
+        return all_pages(self.union, equivalence_groups=self.index.equivalence_groups())

     def site_map(self) -> SiteMapNode:
         """The union namespace tree built from page placements."""
@@ -62,8 +62,16 @@ class _UnionFind:
         self._parent[max(ra, rb)] = min(ra, rb)


-def all_pages(union: UnionGraph) -> tuple[AllPagesEntry, ...]:
-    """Enumerate the union's distinct pages, collapsing chorus + equivalence-bound members."""
+def all_pages(
+    union: UnionGraph,
+    equivalence_groups: tuple[frozenset[str], ...] | None = None,
+) -> tuple[AllPagesEntry, ...]:
+    """Enumerate the union's distinct pages, collapsing chorus + equivalence-bound members.
+
+    ``equivalence_groups`` (string identities, decision-log form) overrides the source of
+    equivalence — the orchestrator passes the maintained index's groups (SHARD-WP-0011 T4); the
+    default falls back to the decision-log fold, so direct callers are unaffected.
+    """
     pages: dict[str, Page] = {}
     by_key: dict[str, list[str]] = {}
     for page in union.iter_pages():
@@ -77,8 +85,9 @@ def all_pages(union: UnionGraph) -> tuple[AllPagesEntry, ...]:
     for idents in by_key.values():  # same key across shards → chorus
         for other in idents[1:]:
             uf.union(idents[0], other)
-    fold = union.log.fold(union.space)
-    for group in fold.equivalence_groups:  # decision-log bindings
+    if equivalence_groups is None:
+        equivalence_groups = union.log.fold(union.space).equivalence_groups
+    for group in equivalence_groups:  # curator bindings (+ maintained content edges)
         present = [m for m in group if m in pages]
         for other in present[1:]:
             uf.union(present[0], other)
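The collapse performed by `all_pages` can be illustrated without the rest of the package: same-key pages across shards are merged first (chorus), then each equivalence group merges whichever of its members are actually present. The sketch below is a simplified model under stated assumptions, not the project's implementation. The `collapse` function, its `fold_groups` callback (standing in for the decision-log fold) and the use of plain `"shard:key"` strings as pages are all hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set, Tuple

Groups = Tuple[FrozenSet[str], ...]


def collapse(
    pages: Iterable[str],
    fold_groups: Callable[[], Groups],
    equivalence_groups: Optional[Groups] = None,
) -> Set[FrozenSet[str]]:
    """Return the equivalence classes of `pages` after chorus + group merging."""
    present: List[str] = list(pages)
    parent: Dict[str, str] = {p: p for p in present}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        ra, rb = find(a), find(b)
        parent[max(ra, rb)] = min(ra, rb)  # smaller root wins, as in the diff

    # Chorus: the same key appearing in several shards is one page.
    by_key: Dict[str, List[str]] = {}
    for p in present:
        by_key.setdefault(p.partition(":")[2], []).append(p)
    for idents in by_key.values():
        for other in idents[1:]:
            union(idents[0], other)

    # Equivalence: an explicit source overrides the default fold.
    if equivalence_groups is None:
        equivalence_groups = fold_groups()
    for group in equivalence_groups:
        members = [m for m in group if m in parent]  # ignore absent members
        for other in members[1:]:
            union(members[0], other)

    classes: Dict[str, Set[str]] = {}
    for p in present:
        classes.setdefault(find(p), set()).add(p)
    return {frozenset(c) for c in classes.values()}


pages = ["a:Foo", "b:Bar", "c:Foo", "d:Solo"]
default = collapse(pages, fold_groups=lambda: ())
override = (frozenset({"a:Foo", "b:Bar", "z:Gone"}),)
routed = collapse(pages, fold_groups=lambda: (), equivalence_groups=override)
```

Leaving the parameter at `None` keeps the old behaviour for direct callers, while a caller that already holds fresher groups can pass them in without the function needing to know where they came from.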
74
tests/test_incremental_wiring.py
Normal file
@@ -0,0 +1,74 @@
+"""Wire the incremental tier behind InformationSpace views (SHARD-WP-0011 T4)."""
+
+from shard_wiki.adapters import FolderAdapter
+from shard_wiki.coordination import EventType
+from shard_wiki.model import Identity
+from shard_wiki.space import InformationSpace
+from shard_wiki.views import all_pages
+
+
+def _shard(tmp_path, name, files):
+    root = tmp_path / name
+    for rel, text in files.items():
+        p = root / rel
+        p.parent.mkdir(parents=True, exist_ok=True)
+        p.write_text(text, encoding="utf-8")
+    return FolderAdapter(name, root)
+
+
+def test_all_pages_via_index_matches_direct_fold(tmp_path):
+    space = InformationSpace("space")
+    space.attach(_shard(tmp_path, "wiki", {"Home.md": "welcome", "Guide.md": "the guide"}))
+    space.attach(_shard(tmp_path, "notes", {"Daily.md": "today"}))
+    # Routed-through-index result equals the direct fold-based computation (behaviour unchanged).
+    via_index = {(e.name, e.members) for e in space.all_pages()}
+    direct = {(e.name, e.members) for e in all_pages(space.union)}
+    assert via_index == direct
+
+
+def test_curator_binding_collapses_via_maintained_index(tmp_path):
+    space = InformationSpace("space")
+    space.attach(_shard(tmp_path, "a", {"Foo.md": "x"}))
+    space.attach(_shard(tmp_path, "b", {"Bar.md": "y"}))
+    space.log.append(
+        "space", EventType.BINDING_MADE, {"members": ["a:Foo", "b:Bar"]}
+    )
+    # The maintained index re-syncs curator edges live from the log fold.
+    collapsed = [e for e in space.all_pages() if len(e.members) == 2]
+    assert len(collapsed) == 1
+    assert set(collapsed[0].members) == {Identity("a", "Foo"), Identity("b", "Bar")}
+
+
+def test_content_duplicate_collapses_via_index(tmp_path):
+    space = InformationSpace("space")
+    space.attach(_shard(tmp_path, "a", {"Foo.md": "the very same body content here"}))
+    space.attach(_shard(tmp_path, "b", {"Bar.md": "the very same body content here"}))
+    dup = [e for e in space.all_pages() if len(e.members) == 2]
+    assert len(dup) == 1  # content equivalence detected by the maintained index
+    assert set(dup[0].members) == {Identity("a", "Foo"), Identity("b", "Bar")}
+
+
+def test_attach_invalidates_index(tmp_path):
+    space = InformationSpace("space")
+    space.attach(_shard(tmp_path, "a", {"Foo.md": "same body"}))
+    assert space.all_pages()  # builds the index (one page, no groups)
+    space.attach(_shard(tmp_path, "b", {"Bar.md": "same body"}))  # marks index stale
+    dup = [e for e in space.all_pages() if len(e.members) == 2]
+    assert len(dup) == 1  # rebuilt fallback picks up the new equivalent page
+
+
+def test_verify_index_reports_healthy_when_consistent(tmp_path):
+    space = InformationSpace("space")
+    space.attach(_shard(tmp_path, "a", {"Foo.md": "same body"}))
+    space.attach(_shard(tmp_path, "b", {"Bar.md": "same body"}))
+    space.all_pages()  # ensure built
+    report = space.verify_index()
+    assert report.healthy is True
+
+
+def test_reindex_is_an_explicit_fallback(tmp_path):
+    space = InformationSpace("space")
+    space.attach(_shard(tmp_path, "a", {"Foo.md": "content"}))
+    before = space.index.digest()
+    space.reindex()
+    assert space.index.digest() == before  # rebuild is deterministic
@@ -4,7 +4,7 @@ type: workplan
 title: "incremental union maintenance + equivalence index + I-2 verification"
 domain: whynot
 repo: shard-wiki
-status: active
+status: done
 owner: tegwick
 topic_slug: whynot
 created: "2026-06-15"
@@ -41,7 +41,7 @@ deployment is later.

 ```task
 id: SHARD-WP-0011-T1
-status: todo
+status: done
 priority: high
 state_hub_task_id: "842f480b-7b14-47cd-818b-012dbda9c187"
 ```
@@ -55,7 +55,7 @@ unrelated pages don't; verified edges match a brute-force oracle on a small corp

 ```task
 id: SHARD-WP-0011-T2
-status: todo
+status: done
 priority: high
 state_hub_task_id: "2da4e0b8-22cc-4ad1-a9aa-b5e991515d30"
 ```
@@ -70,7 +70,7 @@ stale edge.

 ```task
 id: SHARD-WP-0011-T3
-status: todo
+status: done
 priority: high
 state_hub_task_id: "b602ce31-ad9a-4c7f-b596-f039722373fc"
 ```
@@ -85,7 +85,7 @@ equivalent event orders.

 ```task
 id: SHARD-WP-0011-T4
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "2f3d083c-0b2e-4b58-9e96-c0461c5eb089"
 ```