generated from coulomb/repo-seed
history+workplan: CoreArchitectureBlueprint review; SHARD-WP-0005 hardening
Records the critical review (history/260615-...) and establishes SHARD-WP-0005 to fold its findings (A-1, B-1..B-3, C-1..C-3, D-1..D-4) into the blueprint: correctness (state re-frame, identity/equivalence split, consistency model), scale (incremental-first union, equivalence indexing, cache invalidation), elegance (orthogonal spectra, layered provenance, common-case projection, policy module), security/history scaling, and a known-open-problems section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
122
history/260615-core-architecture-blueprint-review.md
Normal file
122
history/260615-core-architecture-blueprint-review.md
Normal file
@@ -0,0 +1,122 @@
|
|||||||
|
# Critical review — CoreArchitectureBlueprint.md
|
||||||
|
|
||||||
|
Date: 2026-06-15 · Reviewer: tegwick (with Claude) · Subject:
|
||||||
|
`spec/CoreArchitectureBlueprint.md` @ commit **9b5b393** · Feeds: **SHARD-WP-0005**
|
||||||
|
|
||||||
|
A deliberately hostile review of the first whole-system architecture, to find where it
|
||||||
|
**breaks (correctness)**, **fails to scale**, and **could be more elegant/efficient** before
|
||||||
|
any implementation. Findings are prioritised; each is the input to a SHARD-WP-0005 task.
|
||||||
|
|
||||||
|
## Verdict in one line
|
||||||
|
|
||||||
|
The **layering and the dual narrow waist are sound and stay**. The **thesis is ~90% right**;
|
||||||
|
the missing 10% (curatorial / coordination-canonical state) breaks its clean story. There are
|
||||||
|
**two genuine bugs**, **two large unaddressed scaling risks**, and several **elegance/efficiency
|
||||||
|
debts** — all fixable without touching INTENT.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## A. The framing crack (fix resolves three issues)
|
||||||
|
|
||||||
|
**A-1 — Two buckets hide a third.** The thesis "canonical at the edges, derived in the middle"
|
||||||
|
omits **born-in-the-middle-but-canonical** state: overlays that are the local truth against a
|
||||||
|
read-only shard (Flow C), manual **curator equivalence bindings**, alias tables, merge
|
||||||
|
decisions. These encode human judgment or local-only content and **cannot be rebuilt** from
|
||||||
|
shards+journal.
|
||||||
|
|
||||||
|
**Contradiction:** I-2 declares L4 rebuildable, yet §8.4 puts "alias table, curator binding"
|
||||||
|
in L4. You cannot rebuild a curator's manual binding.
|
||||||
|
|
||||||
|
**Fix:** three states — **sharded-canonical**, **coordination-canonical** (journal: overlays,
|
||||||
|
bindings, aliases, merges — durable, born in the middle), **derived-disposable** (union graph,
|
||||||
|
indexes, projections). Re-frame §1 as **canonical (sharded + coordination) vs derived
|
||||||
|
(disposable)**; `derived = f(canonical)` then becomes actually true. → **T1**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## B. Where it breaks (correctness)
|
||||||
|
|
||||||
|
**B-1 — Identity conflated with content-fingerprint (BUG).** §7.2 derives page identity from
|
||||||
|
content fingerprint. That makes **editing a page change its identity**, breaking every
|
||||||
|
reference. Fingerprints identify *versions/equivalence*, not *identity*. Page identity must be
|
||||||
|
a **stable handle (uid)** surviving edits; fingerprints belong to the **equivalence** mechanism
|
||||||
|
(§8.4). One word, two concepts, wrong implementation for the stable one. → **T2**
|
||||||
|
|
||||||
|
**B-2 — No concurrency/consistency model.** Concurrent overlays on one page, overlay applied
|
||||||
|
after source drift, journal-commit vs shard-native-write ordering — all undefined. Conflict
|
||||||
|
handling is deferred to "policy presets," but **conflict *detection + representation* is core
|
||||||
|
mechanism**; only *resolution* is policy. The union's consistency guarantee is unstated
|
||||||
|
(eventually-consistent? read-your-writes? causal-via-journal?). → **T3**
|
||||||
|
|
||||||
|
**B-3 — Persisted union cache + multi-tenant = leak surface.** §13 recommends a persisted L4
|
||||||
|
cache; §9 protects content by *read-time* filtering on the provenance envelope. A persisted
|
||||||
|
cross-tenant union cache guarded only by read-time filtering is an L4 attack surface. Tension
|
||||||
|
between I-2 (persisted rebuildable cache), scale, and L5 isolation is unacknowledged. → **T8**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## C. Where it fails to scale
|
||||||
|
|
||||||
|
**C-1 — Equivalence detection is O(N²), no indexing/incremental story.** Fingerprint /
|
||||||
|
span-set-overlap across all pages of all shards is combinatorial (10 shards × 100k pages ≈
|
||||||
|
10¹² comparisons). No blocking/LSH/indexing, no incremental maintenance. Biggest scaling
|
||||||
|
hazard in the document. → **T4**
|
||||||
|
|
||||||
|
**C-2 — "Rebuildable cache" collides with the operational-envelope axis.** A byte-exact
|
||||||
|
rebuild requires reading *every page of every shard*, including rate-limited/paginated
|
||||||
|
external APIs (Notion) and irreducibly-live sources — hours-to-days. I-2 contradicts axis-10.
|
||||||
|
**Incremental, change-driven maintenance must be primary** (notify→delta), rebuild a rare
|
||||||
|
fallback. Cache invalidation — the actual hard problem — is named once and never designed. →
|
||||||
|
**T4, T5**
|
||||||
|
|
||||||
|
**C-3 — Unbounded history at open L0 = DoS/perf.** "Every write a commit" + "open for all" ⇒
|
||||||
|
the git journal grows without bound under bots/vandalism and git degrades on huge histories.
|
||||||
|
"History is the floor" has an unacknowledged cost: packing, compaction, per-shard offload. →
|
||||||
|
**T8**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## D. Elegance / efficiency debts
|
||||||
|
|
||||||
|
**D-1 — The 15 spectra assert a clean degradation function never demonstrated.** Either most
|
||||||
|
axes are irrelevant to most ops (then the 15-D profile is ceremony), or behavior depends on
|
||||||
|
several axes *jointly* (then "no per-backend code" becomes a sprawling axis-interaction matrix
|
||||||
|
— the flat-checklist problem in higher dimensions). And the axes **aren't orthogonal**
|
||||||
|
(git-native history ⟺ git-IS-store ⟺ git/text merge; encrypted opacity ⟹ query/translation
|
||||||
|
collapse). Model a **smaller orthogonal core** + **derived/implied** positions, and state the
|
||||||
|
**axis-interaction subset** the degradation logic truly uses. → **T6**
|
||||||
|
|
||||||
|
**D-2 — Provenance envelope isn't inherited; it'll dwarf the content.** Per-span envelopes at
|
||||||
|
block granularity = 10k near-identical envelopes for a 10k-block graph. The doc already
|
||||||
|
invented the right pattern for Trilium ("effective-vs-own with per-attribute provenance") and
|
||||||
|
failed to apply it to its own envelope. Make provenance **layered (page envelope + span
|
||||||
|
deltas)**. → **T7**
|
||||||
|
|
||||||
|
**D-3 — Projection machinery over-fit to the exotic tail.** Two-axis model + three facets +
|
||||||
|
view registry exist mostly for UC-83/84 (2 of 84 UCs); the 95% case (markdown in git) pays the
|
||||||
|
weight. Make the **common case trivial** (default = plain lazy replication) and
|
||||||
|
derivation/liveness an **extension point**, not a taxonomy every projection instantiates. →
|
||||||
|
**T7**
|
||||||
|
|
||||||
|
**D-4 — Cross-cutting rails are the highest-coupling components, presented as clean.**
|
||||||
|
`provenance/` and capability types are imported by every layer (god-modules); an envelope
|
||||||
|
change ripples everywhere. And **policy has no module** (§10 enumerates it; §11 omits it)
|
||||||
|
despite being consulted by L3/L4/L5. Give policy a home; pin the rails behind stable narrow
|
||||||
|
interfaces. → **T7**
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## E. What explicitly stays
|
||||||
|
|
||||||
|
- The 6-layer model + the dual narrow waist (adapter contract / page model).
|
||||||
|
- Capability-as-data (I-3), union-without-erasure (I-4), overlay-before-mutation (I-5),
|
||||||
|
Git-addressable coordination (I-6), mechanism-over-policy (I-7), graceful degradation (I-8).
|
||||||
|
- The federation-model taxonomy and the auth ladder (ArchitectureBlueprint.md).
|
||||||
|
|
||||||
|
## F. Disposition
|
||||||
|
|
||||||
|
Some findings are **solvable now** (A-1, B-1, D-2, D-3, D-4, C-3); some are **partially open**
|
||||||
|
and should be tracked honestly rather than pretend-solved (B-2 consistency model: pick a
|
||||||
|
guarantee; C-1 equivalence-at-scale: pick a blocking strategy; D-1 axis interactions: enumerate
|
||||||
|
the real subset). SHARD-WP-0005 closes the solvable ones and records the open ones in a new
|
||||||
|
"Known scaling risks & open problems" section of the blueprint. → **T9**
|
||||||
@@ -1,8 +1,17 @@
|
|||||||
# history/
|
# history/
|
||||||
|
|
||||||
Archived material that is no longer needed for daily work but should be kept.
|
Archived material and the project's **meta-history**: finished/canceled workplans kept for
|
||||||
|
the record, plus durable **reviews, critical assessments, and decision records** — the
|
||||||
|
reasoning behind the specs, captured at a point in time.
|
||||||
|
|
||||||
Use a `yymmdd-` prefix when archiving files or directories. Content here is
|
Use a `yymmdd-` prefix. Archived material is **out of scope** for regular tasks (consult only
|
||||||
**out of scope** for regular tasks — consult only for research or diagnostics.
|
for research or diagnostics); assessment/review records are point-in-time and may seed active
|
||||||
|
workplans, but are not edited after the fact — supersede with a new dated record and link back.
|
||||||
|
|
||||||
Finished or canceled workplans from `workplans/` are archived here.
|
Distinct from the **coordination journal** (a runtime Git-backed record of *content* change
|
||||||
|
flows inside an information space, an INTENT domain concept); `history/` is the *project's own*
|
||||||
|
design evolution.
|
||||||
|
|
||||||
|
| Date | Record | Subject |
|
||||||
|
|------|--------|---------|
|
||||||
|
| 2026-06-15 | `260615-core-architecture-blueprint-review.md` | Critical review of `spec/CoreArchitectureBlueprint.md` (commit 9b5b393); inputs to `SHARD-WP-0005` |
|
||||||
226
workplans/SHARD-WP-0005-architecture-hardening.md
Normal file
226
workplans/SHARD-WP-0005-architecture-hardening.md
Normal file
@@ -0,0 +1,226 @@
|
|||||||
|
---
|
||||||
|
id: SHARD-WP-0005
|
||||||
|
type: workplan
|
||||||
|
title: "core architecture hardening (blueprint review fixes)"
|
||||||
|
domain: whynot
|
||||||
|
repo: shard-wiki
|
||||||
|
status: active
|
||||||
|
owner: tegwick
|
||||||
|
topic_slug: whynot
|
||||||
|
created: "2026-06-15"
|
||||||
|
updated: "2026-06-15"
|
||||||
|
depends_on:
|
||||||
|
- SHARD-WP-0002
|
||||||
|
---
|
||||||
|
|
||||||
|
# SHARD-WP-0005 — Core architecture hardening
|
||||||
|
|
||||||
|
## Goal
|
||||||
|
|
||||||
|
Resolve the findings of the critical review
|
||||||
|
(`history/260615-core-architecture-blueprint-review.md`) by hardening
|
||||||
|
`spec/CoreArchitectureBlueprint.md` for **correctness, scale, and elegance** before
|
||||||
|
implementation. Close every *solvable* finding; record every *partially-open* finding
|
||||||
|
explicitly (consistency model, equivalence-at-scale strategy, axis-interaction subset) rather
|
||||||
|
than pretending it is solved.
|
||||||
|
|
||||||
|
Primary deliverable: a revised `spec/CoreArchitectureBlueprint.md` (the review's A–F findings
|
||||||
|
folded in) plus a new **"Known scaling risks & open problems"** section.
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
- Review: `history/260615-core-architecture-blueprint-review.md` (findings A-1, B-1…B-3,
|
||||||
|
C-1…C-3, D-1…D-4; disposition F).
|
||||||
|
- Architecture under revision: `spec/CoreArchitectureBlueprint.md` @ 9b5b393.
|
||||||
|
- Constraints: `INTENT.md` (no amendment expected — all fixes live inside existing
|
||||||
|
boundaries); the synthesis inputs already folded into `SHARD-WP-0002`.
|
||||||
|
|
||||||
|
**Non-goal:** Implement anything. This workplan revises the architecture spec only.
|
||||||
|
|
||||||
|
## Guiding aims
|
||||||
|
|
||||||
|
- **Elegance:** prefer fewer, orthogonal concepts; make the common case trivial and the
|
||||||
|
exotic case possible (not the reverse).
|
||||||
|
- **No pretend-solved:** an honestly-open problem with a chosen direction beats a hand-wave.
|
||||||
|
- **INTENT-preserving:** every change must still honour the 12 invariants (or revise an
|
||||||
|
invariant deliberately and say so).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Re-frame the state model: canonical (sharded + coordination) vs derived
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T1
|
||||||
|
status: todo
|
||||||
|
priority: high
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix finding **A-1** (and its I-2 contradiction). Replace the two-bucket thesis with **three
|
||||||
|
states**: **sharded-canonical** (shard content), **coordination-canonical** (journal:
|
||||||
|
overlays, curator equivalence bindings, alias tables, merge decisions — durable, born in the
|
||||||
|
middle), **derived-disposable** (union graph, indexes, projections). Re-frame §1 as
|
||||||
|
**canonical (sharded + coordination) vs derived (disposable)**; make `derived = f(canonical)`
|
||||||
|
literally true. Update I-2, the §3 dependency rule (only the disposable tier is rebuildable),
|
||||||
|
§4 abstractions (name coordination-canonical state), and move "alias table / curator binding"
|
||||||
|
out of L4-rebuildable into the coordination tier.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Split page identity from content equivalence
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T2
|
||||||
|
status: todo
|
||||||
|
priority: high
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix bug **B-1**. Separate two concepts §7.2/§8.4 conflate: **page identity** = a *stable
|
||||||
|
handle* (shard-scoped uid, name-based, survives edits) used for references/placement; **content
|
||||||
|
equivalence** = fingerprint / span-set overlap used to *detect sameness*, never as identity.
|
||||||
|
State that a fingerprint identifies a *version/content*, not a *page*. Reconcile with
|
||||||
|
identity≠placement (I-9): identity (stable) → placements (N) → equivalence (cross-identity
|
||||||
|
sameness).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Consistency, concurrency & conflict model
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T3
|
||||||
|
status: todo
|
||||||
|
priority: high
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix bug **B-2**. Add a new section stating shard-wiki's **consistency guarantee** (choose and
|
||||||
|
justify: e.g. causal consistency via the coordination journal; read-your-writes for local
|
||||||
|
overlays; eventual convergence for projected union). Specify **conflict detection +
|
||||||
|
representation as core mechanism** (divergence detection, keep-both/coexist representation),
|
||||||
|
keeping only *resolution* as policy (I-7). Define **overlay-apply semantics under source
|
||||||
|
drift** (rebase/refuse/three-way), and journal-commit vs shard-native-write ordering. Mark any
|
||||||
|
residual as open (→ T9).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Scaling the union: incremental-first, equivalence indexing, rebuild-as-fallback
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T4
|
||||||
|
status: todo
|
||||||
|
priority: high
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix scaling findings **C-1, C-2**. Make **incremental, change-driven maintenance the primary
|
||||||
|
mechanism** for the derived tier: the `notify` capability (or poll/ETag fallback) drives
|
||||||
|
**delta updates** to union/index/projections; full rebuild is a rare fallback (and explicitly
|
||||||
|
*not required* to be cheap for rate-limited shards — reconcile with axis-10). Replace O(N²)
|
||||||
|
equivalence with a **blocking/indexing strategy** (normalised-title/path buckets, fingerprint
|
||||||
|
shingling/LSH, candidate generation then verify) and **incremental equivalence maintenance**.
|
||||||
|
Update §8.4 and I-2 (rebuildability is a *correctness property of the disposable tier*, not an
|
||||||
|
operational expectation).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Cache freshness & invalidation protocol
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T5
|
||||||
|
status: todo
|
||||||
|
priority: medium
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix finding **C-2 (invalidation)**. Design the replication-projection **freshness/invalidation
|
||||||
|
protocol**: staleness semantics (TTL vs event-driven), push (notify/webhook/ActivityPub) vs
|
||||||
|
poll (ETag/If-Modified) vs hybrid per capability profile, single-flight / coalescing to avoid
|
||||||
|
thundering-herd refetch, and how freshness is surfaced in the provenance envelope. Tie to the
|
||||||
|
operational-envelope axis (rate-limited shards favour event-driven + long TTL).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Capability spectra: orthogonal core, implied positions, interaction subset
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T6
|
||||||
|
status: todo
|
||||||
|
priority: medium
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix elegance finding **D-1**. Identify a **smaller orthogonal core** of capability axes and
|
||||||
|
mark the rest as **derived/implied** (e.g. attachment=git-IS-store ⟹ history=git-native ⟹
|
||||||
|
merge=git/text; opacity=encrypted ⟹ query/translation degrade). Explicitly enumerate the
|
||||||
|
**axis-interaction subset** the degradation function actually depends on (so "no per-backend
|
||||||
|
code" is a demonstrated claim, not an assertion), and forbid impossible profiles via the
|
||||||
|
implied-position rules. Update §6.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Elegance pass: layered provenance, common-case projection, policy module & rails
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T7
|
||||||
|
status: todo
|
||||||
|
priority: medium
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix findings **D-2, D-3, D-4** together (the structural elegance/efficiency cluster):
|
||||||
|
|
||||||
|
- **Layered provenance** (D-2): page-level envelope + span-level *deltas* (the same
|
||||||
|
effective-vs-own pattern used for Trilium metadata), so per-span cost is near-zero when
|
||||||
|
uniform. Update §4/§7.2 and the provenance rail.
|
||||||
|
- **Common-case-trivial projection** (D-3): default = plain lazy replication-projection;
|
||||||
|
derivation/liveness/view-registry become an **extension point** invoked only for
|
||||||
|
computational/typed content — not a taxonomy every projection instantiates. Re-shape §8.4–8.5.
|
||||||
|
- **Policy module + rail discipline** (D-4): add a `policy/` module owning the §10 surface;
|
||||||
|
pin `provenance/` and capability types behind **stable narrow interfaces** to cap coupling.
|
||||||
|
Update §11 and the dependency rules.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Security/multi-tenancy isolation & history scaling
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T8
|
||||||
|
status: todo
|
||||||
|
priority: medium
|
||||||
|
```
|
||||||
|
|
||||||
|
Fix findings **B-3, C-3**:
|
||||||
|
|
||||||
|
- **Tenant isolation of derived state** (B-3): the persisted derived tier is **partitioned per
|
||||||
|
tenant/root-entity**; no cross-tenant union cache guarded only by read-time filtering.
|
||||||
|
Reconcile I-2 + L5; state the isolation invariant. Update §9/§13.
|
||||||
|
- **History scaling** (C-3): a strategy for unbounded open-L0 history — git packing/gc,
|
||||||
|
**compaction/squash policy for low-value churn**, per-shard history offload, and
|
||||||
|
rate-limiting/anti-abuse hooks — without weakening recoverability (I-10). Update §8.1.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Known scaling risks & open problems; invariant + traceability refresh
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: SHARD-WP-0005-T9
|
||||||
|
status: todo
|
||||||
|
priority: medium
|
||||||
|
```
|
||||||
|
|
||||||
|
Close out (finding **F**). Add a new **"Known scaling risks & open problems"** section listing
|
||||||
|
the partially-open items with their chosen direction and the trigger that would force a
|
||||||
|
revisit (consistency-model edge cases, equivalence-blocking false-negative rate,
|
||||||
|
axis-interaction completeness, persisted-cache cost ceiling). Refresh the **invariants table**
|
||||||
|
(any added/changed invariant), the **§13 decisions** (mark resolved vs still-open), and the
|
||||||
|
**§15 traceability** (link this review + SHARD-WP-0005). Final `check_repo_consistency` pass.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Acceptance criteria
|
||||||
|
|
||||||
|
- Every review finding A-1, B-1–B-3, C-1–C-3, D-1–D-4 is either **resolved in the blueprint**
|
||||||
|
or **listed as a known open problem with a chosen direction** (none silently dropped).
|
||||||
|
- The blueprint still honours all INTENT invariants, or revises one *deliberately and visibly*.
|
||||||
|
- `spec/CoreArchitectureBlueprint.md` reads as **more elegant**, not merely more detailed:
|
||||||
|
fewer/orthogonal core concepts; common case trivial; exotic case possible.
|
||||||
|
- Each task committed; SCOPE/spec-README updated where status changes; state-hub synced.
|
||||||
|
|
||||||
|
## Suggested task order
|
||||||
|
|
||||||
|
Correctness first (**T1 → T2 → T3**), then scale (**T4 → T5**), then elegance (**T6 → T7**),
|
||||||
|
then hardening (**T8**), then close-out (**T9**).
|
||||||
Reference in New Issue
Block a user