diff --git a/spec/CoreArchitectureBlueprint.md b/spec/CoreArchitectureBlueprint.md
index e9ee2d0..8d65066 100644
--- a/spec/CoreArchitectureBlueprint.md
+++ b/spec/CoreArchitectureBlueprint.md
@@ -360,6 +360,24 @@ break every reference to it (review bug B-1). They are pulled apart here:
 So the chain is: **identity (stable) → placements (N, mutable) → equivalence (cross-identity
 sameness, fingerprint-based)** — three concepts, three mechanisms, never conflated.
 
+### 7.3 Provenance is layered, not per-span-duplicated
+
+A provenance envelope on *every span* (source shard, freshness, liveness, overlay status,
+authz context, divergence, lineage) would, at block granularity, mean ~10k near-identical
+envelopes for a 10k-block page — provenance dwarfing content (review D-2). The fix is the exact
+pattern the page model already uses for Trilium's computed metadata: **effective-vs-own**.
+
+- **Page-level envelope** holds the values that are uniform across the page (almost always:
+  source shard, observed-at, liveness, authz context).
+- **Span-level deltas** record *only where a span differs* from its page envelope — a
+  transcluded span from another shard, an overlaid span, a span that diverges. A span with no
+  delta inherits the page envelope at zero storage cost.
+- **Effective provenance** for any span = page envelope ⊕ span delta, computed on read.
+
+Per-span cost is therefore **near-zero in the common (uniform) case** and pays only for genuine
+heterogeneity — the same "carry only the difference" principle, applied to shard-wiki's own
+metadata. Provenance remains complete (I-4); it is just not redundantly materialised.
+
 ---
 
 ## 8. Coordination, federation & projection
@@ -411,18 +429,24 @@ derived**; recompute reads them, never regenerates them. It comprises:
 - **Transclusion** — one **reference-not-copy** primitive unifying Xanadu transclusion, ZigZag
   clone, Roam/Obsidian/Logseq embed, Notion synced block, Trilium note-cloning, and literate
   named-chunk assembly, over the addressable union.
-- **Projection — the two-axis model:**
-  - *Kind:* **replication-projection** (lazy cache of remote content — the default) vs
-    **derivation-projection** (transform/compile/weave/evaluate a source).
-  - *Liveness:* static → captured snapshot → live-over-files → view-time → irreducibly-live.
-  - Derivation facets: materialization timing (ahead-of-time vs view-time), multiplicity (one
-    output vs N co-equal), continuity (one-shot vs continuous). Every projection declares its
-    liveness + freshness + provenance; the irreducibly-live far end has no faithful static
-    form (source + a marked recording).
-- **Moldable view registry** — projection generalises to an **open, type-keyed set of
-  co-equal, possibly-computed views, none canonical-by-fact** (display-canonical is policy).
-  This unifies replication/derivation/dimensional/query projection and answers the "pluggable
-  content-type registry" question (GT prior art).
+- **Projection — trivial by default, extensible for the tail.** The 95% case (Markdown in a
+  shard) must cost nothing conceptually, so:
+  - **Default = plain lazy replication-projection** — a freshness-stamped cache of remote
+    content (§8.8). This is *the* projection for ordinary pages; it needs no taxonomy, no
+    liveness reasoning, no registry. Most shards never touch anything below.
+  - **Extension point — derivation-projection** — invoked *only* for content that is a
+    *source* needing transform/compile/weave/evaluate (computational/typed content, §8.5). It
+    adds the liveness axis (static → captured → live-over-files → view-time → irreducibly-live)
+    and facets (materialization timing, multiplicity, continuity); the irreducibly-live far end
+    has no faithful static form (source + a marked recording). A binding that never serves such
+    content never instantiates any of this.
+  - Both kinds stamp freshness + provenance; only derivation carries the liveness machinery.
+- **Moldable view registry — also an extension point, not a tax on every page.** Where a content
+  type offers multiple co-equal views (typed/computed/dimensional content), they are registered
+  as an **open, type-keyed set, none canonical-by-fact** (display-canonical is policy; GT prior
+  art, answers the "pluggable content-type registry" question). An ordinary Markdown page has
+  exactly one view and never consults the registry — the registry is queried only when a type
+  declares >1 view.
 - **Derived query index** — delegate to a shard's native query engine where present
   (Roam/Logseq Datalog, Notion DB query, XWiki XWQL, Wikibase SPARQL); else build a derived
   index over the projection (the Logseq DataScript-over-files pattern). The index is
@@ -631,13 +655,27 @@ src/shard_wiki/
   projection/     # L4 (derived): ReplicationProjection, DerivationProjection,
                   #   ViewRegistry (moldable), QueryIndex (delegate|derive)
   authz/          # L5 cross-cut: PDP, PEP, IdentityProvider iface, NullProvider
-  provenance/     # cross-cut: the envelope plumbing used by every layer
+  provenance/     # cross-cut LEAF: ProvenanceEnvelope type + ⊕ (effective) only — pure data
+  policy/         # cross-cut LEAF: the §10 policy surface (presets + a resolve() read by
+                  #   coordination/federation/projection/authz); owns NO mechanism
   api/            # L6: orchestrator API (server-side union for agents/CLI)
 ```
 
-Hard import rules: `union/` and `projection/` may import `model/`, `adapters/`,
-`coordination/` but **nothing may import them** (they are the disposable middle). `model/` and
-`adapters/` import nothing else in the tree except `provenance/` (the waists stay thin).
+**The cross-cutting rails are leaves, not god-modules (review D-4).** `provenance/` and
+`policy/` are imported widely, so they are the highest coupling risk; the discipline that caps
+it is: **they may import *nothing* in the tree and contain *only* stable data types + pure
+functions** (the envelope and its `⊕`; the policy presets and a `resolve(question) → choice`).
+Mechanism never lives in a rail — `policy/` says *what* the preset is, `coordination/`/
+`projection/` decide *how* to honour it. A change to a rail is then a change to a small, stable,
+dependency-free leaf, not a ripple through every layer. Capability-spectrum value types live in
+`model/` (also leaf-like) for the same reason.
+
+Hard import rules (enforced by import lint):
+- `union/` and `projection/` may import `model/`, `adapters/`, `coordination/`, `policy/`,
+  `provenance/` — but **nothing may import them** (they are the disposable derived tier).
+- `model/`, `adapters/`, `provenance/`, `policy/` import nothing else in the tree (the waists
+  and rails stay thin); `provenance/` and `policy/` import nothing at all.
+- `coordination/` and `federation/` may import the waists + rails, never the derived tier.
 
 ---
 
diff --git a/workplans/SHARD-WP-0005-architecture-hardening.md b/workplans/SHARD-WP-0005-architecture-hardening.md
index c4cf1bd..a9bdd5f 100644
--- a/workplans/SHARD-WP-0005-architecture-hardening.md
+++ b/workplans/SHARD-WP-0005-architecture-hardening.md
@@ -146,7 +146,7 @@ operational-envelope axis (rate-limited shards favour event-driven + long TTL).
 
 ```task
 id: SHARD-WP-0005-T6
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "f04ce101-0d95-4e1a-ab8b-80dfff9d2dda"
 ```
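
Reviewer sketch for the §7.3 hunk: the effective-vs-own rule (effective provenance = page envelope ⊕ span delta, computed on read) might look roughly like the following. This is a minimal illustration under stated assumptions, not the project's implementation. The diff names only a `ProvenanceEnvelope` type and a `⊕` operation; the field names, the `SpanDelta` class, the `effective()` function, and the convention that `None` means "inherit" are all hypothetical.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ProvenanceEnvelope:
    """Page-level envelope: values uniform across the page (field names assumed)."""
    source_shard: str
    observed_at: str
    liveness: str
    authz_context: str
    overlay_status: str = "none"


@dataclass(frozen=True)
class SpanDelta:
    """Hypothetical span-level delta: only the fields where a span differs.

    None means "inherit from the page envelope".
    """
    source_shard: Optional[str] = None
    observed_at: Optional[str] = None
    liveness: Optional[str] = None
    authz_context: Optional[str] = None
    overlay_status: Optional[str] = None


def effective(page: ProvenanceEnvelope, delta: Optional[SpanDelta]) -> ProvenanceEnvelope:
    """Sketch of the ⊕ operation: overlay a span delta on the page envelope."""
    if delta is None:
        # Common (uniform) case: the span stores nothing and reuses the page envelope.
        return page
    overrides = {
        f.name: getattr(delta, f.name)
        for f in fields(delta)
        if getattr(delta, f.name) is not None
    }
    return replace(page, **overrides)


# Usage: only span 7 (transcluded from another shard) carries a delta.
page = ProvenanceEnvelope("shard-a", "2024-01-01T00:00:00Z", "static", "public")
deltas = {7: SpanDelta(source_shard="shard-b")}
uniform_span = effective(page, deltas.get(3))      # inherits the page envelope
transcluded_span = effective(page, deltas.get(7))  # differs only in source_shard
```

Because both types are frozen data and `effective()` is a pure function with no imports from the rest of the tree, a module shaped like this would satisfy the "leaf" constraint the third hunk places on `provenance/`.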