generated from coulomb/repo-seed
Records the critical review (history/260615-...) and establishes SHARD-WP-0005 to fold its findings (A-1, B-1..B-3, C-1..C-3, D-1..D-4) into the blueprint: correctness (state re-frame, identity/equivalence split, consistency model), scale (incremental-first union, equivalence indexing, cache invalidation), elegance (orthogonal spectra, layered provenance, common-case projection, policy module), security/history scaling, and a known-open-problems section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
123 lines
6.7 KiB
Markdown
123 lines
6.7 KiB
Markdown
# Critical review — CoreArchitectureBlueprint.md
|
||
|
||
Date: 2026-06-15 · Reviewer: tegwick (with Claude) · Subject:
|
||
`spec/CoreArchitectureBlueprint.md` @ commit **9b5b393** · Feeds: **SHARD-WP-0005**
|
||
|
||
A deliberately hostile review of the first whole-system architecture, to find where it
|
||
**breaks (correctness)**, **fails to scale**, and **could be more elegant/efficient** before
|
||
any implementation. Findings are prioritised; each is the input to a SHARD-WP-0005 task.
|
||
|
||
## Verdict in one line
|
||
|
||
The **layering and the dual narrow waist are sound and stay**. The **thesis is ~90% right**;
|
||
the missing 10% (curatorial / coordination-canonical state) breaks its clean story. There are
|
||
**two genuine bugs**, **two large unaddressed scaling risks**, and several **elegance/efficiency
|
||
debts** — all fixable without touching INTENT.
|
||
|
||
---
|
||
|
||
## A. The framing crack (fix resolves three issues)
|
||
|
||
**A-1 — Two buckets hide a third.** The thesis "canonical at the edges, derived in the middle"
|
||
omits **born-in-the-middle-but-canonical** state: overlays that are the local truth against a
|
||
read-only shard (Flow C), manual **curator equivalence bindings**, alias tables, merge
|
||
decisions. These encode human judgment or local-only content and **cannot be rebuilt** from
|
||
shards+journal.
|
||
|
||
**Contradiction:** I-2 declares L4 rebuildable, yet §8.4 puts "alias table, curator binding"
|
||
in L4. You cannot rebuild a curator's manual binding.
|
||
|
||
**Fix:** three states — **sharded-canonical**, **coordination-canonical** (journal: overlays,
|
||
bindings, aliases, merges — durable, born in the middle), **derived-disposable** (union graph,
|
||
indexes, projections). Re-frame §1 as **canonical (sharded + coordination) vs derived
|
||
(disposable)**; `derived = f(canonical)` then becomes actually true. → **T1**
|
||
|
||
---
|
||
|
||
## B. Where it breaks (correctness)
|
||
|
||
**B-1 — Identity conflated with content-fingerprint (BUG).** §7.2 derives page identity from
|
||
content fingerprint. That makes **editing a page change its identity**, breaking every
|
||
reference. Fingerprints identify *versions/equivalence*, not *identity*. Page identity must be
|
||
a **stable handle (uid)** surviving edits; fingerprints belong to the **equivalence** mechanism
|
||
(§8.4). One word, two concepts, wrong implementation for the stable one. → **T2**
|
||
|
||
**B-2 — No concurrency/consistency model.** Concurrent overlays on one page, overlay applied
|
||
after source drift, journal-commit vs shard-native-write ordering — all undefined. Conflict
|
||
handling is deferred to "policy presets," but **conflict *detection + representation* is core
|
||
mechanism**; only *resolution* is policy. The union's consistency guarantee is unstated
|
||
(eventually-consistent? read-your-writes? causal-via-journal?). → **T3**
|
||
|
||
**B-3 — Persisted union cache + multi-tenant = leak surface.** §13 recommends a persisted L4
|
||
cache; §9 protects content by *read-time* filtering on the provenance envelope. A persisted
|
||
cross-tenant union cache guarded only by read-time filtering is an L4 attack surface. Tension
|
||
between I-2 (persisted rebuildable cache), scale, and L5 isolation is unacknowledged. → **T8**
|
||
|
||
---
|
||
|
||
## C. Where it fails to scale
|
||
|
||
**C-1 — Equivalence detection is O(N²), no indexing/incremental story.** Fingerprint /
|
||
span-set-overlap across all pages of all shards is combinatorial (10 shards × 100k pages ≈
|
||
10¹² comparisons). No blocking/LSH/indexing, no incremental maintenance. Biggest scaling
|
||
hazard in the document. → **T4**
|
||
|
||
**C-2 — "Rebuildable cache" collides with the operational-envelope axis.** A byte-exact
|
||
rebuild requires reading *every page of every shard*, including rate-limited/paginated
|
||
external APIs (Notion) and irreducibly-live sources — hours-to-days. I-2 contradicts axis-10.
|
||
**Incremental, change-driven maintenance must be primary** (notify→delta), rebuild a rare
|
||
fallback. Cache invalidation — the actual hard problem — is named once and never designed. →
|
||
**T4, T5**
|
||
|
||
**C-3 — Unbounded history at open L0 = DoS/perf.** "Every write a commit" + "open for all" ⇒
|
||
the git journal grows without bound under bots/vandalism and git degrades on huge histories.
|
||
"History is the floor" has an unacknowledged cost: packing, compaction, per-shard offload. →
|
||
**T8**
|
||
|
||
---
|
||
|
||
## D. Elegance / efficiency debts
|
||
|
||
**D-1 — The 15 spectra assert a clean degradation function never demonstrated.** Either most
|
||
axes are irrelevant to most ops (then the 15-D profile is ceremony), or behavior depends on
|
||
several axes *jointly* (then "no per-backend code" becomes a sprawling axis-interaction matrix
|
||
— the flat-checklist problem in higher dimensions). And the axes **aren't orthogonal**
|
||
(git-native history ⟺ git-IS-store ⟺ git/text merge; encrypted opacity ⟹ query/translation
|
||
collapse). Model a **smaller orthogonal core** + **derived/implied** positions, and state the
|
||
**axis-interaction subset** the degradation logic truly uses. → **T6**
|
||
|
||
**D-2 — Provenance envelope isn't inherited; it'll dwarf the content.** Per-span envelopes at
|
||
block granularity = 10k near-identical envelopes for a 10k-block graph. The doc already
|
||
invented the right pattern for Trilium ("effective-vs-own with per-attribute provenance") and
|
||
failed to apply it to its own envelope. Make provenance **layered (page envelope + span
|
||
deltas)**. → **T7**
|
||
|
||
**D-3 — Projection machinery over-fit to the exotic tail.** Two-axis model + three facets +
|
||
view registry exist mostly for UC-83/84 (2 of 84 UCs); the 95% case (markdown in git) pays the
|
||
weight. Make the **common case trivial** (default = plain lazy replication) and
|
||
derivation/liveness an **extension point**, not a taxonomy every projection instantiates. →
|
||
**T7**
|
||
|
||
**D-4 — Cross-cutting rails are the highest-coupling components, presented as clean.**
|
||
`provenance/` and capability types are imported by every layer (god-modules); an envelope
|
||
change ripples everywhere. And **policy has no module** (§10 enumerates it; §11 omits it)
|
||
despite being consulted by L3/L4/L5. Give policy a home; pin the rails behind stable narrow
|
||
interfaces. → **T7**
|
||
|
||
---
|
||
|
||
## E. What explicitly stays
|
||
|
||
- The 6-layer model + the dual narrow waist (adapter contract / page model).
|
||
- Capability-as-data (I-3), union-without-erasure (I-4), overlay-before-mutation (I-5),
|
||
Git-addressable coordination (I-6), mechanism-over-policy (I-7), graceful degradation (I-8).
|
||
- The federation-model taxonomy and the auth ladder (ArchitectureBlueprint.md).
|
||
|
||
## F. Disposition
|
||
|
||
Some findings are **solvable now** (A-1, B-1, D-2, D-3, D-4, C-3); some are **partially open**
|
||
and should be tracked honestly rather than pretend-solved (B-2 consistency model: pick a
|
||
guarantee; C-1 equivalence-at-scale: pick a blocking strategy; D-1 axis interactions: enumerate
|
||
the real subset). SHARD-WP-0005 closes the solvable ones and records the open ones in a new
|
||
"Known scaling risks & open problems" section of the blueprint. → **T9**
|