research: Federated Wiki deep dive (journal/fork/neighborhood); UC-70-72

SHARD-WP-0003 T1. Federation model (not a shard candidate): per-page
append-only semantic-action journal with story as derived replay,
fork-with-site-provenance, neighborhood/roster discovery + chorus of forks.
Prior art for shard-wiki's own pillars: coordination journal (UC-71),
overlay-before-mutation (UC-26 fork), union-without-erasure (UC-72).
Attach as REST/file-store hybrid (page JSON + CORS, UC-70). Feeds
SHARD-WP-0002 T1-T5, T11, T13, T16.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 19:01:13 +02:00
parent 9acd4e2841
commit 036dbad816
6 changed files with 383 additions and 5 deletions

@@ -0,0 +1,20 @@
# 260614 — Federated Wiki deep dive
Deep dive on Ward Cunningham's **Federated Wiki** (Smallest Federated Wiki / SFW,
2011 →) as a **federation model** rather than a single shard: fork-with-provenance,
the per-page **JSON journal** of semantic actions, the **story** of typed items, the
**neighborhood/roster** discovery model, and time-bounded **happenings**.
This is prior art for shard-wiki's **coordination layer itself** — the closest existing
system to "a union of pages preserving provenance, assembled non-destructively." It
extends `research/260608-federation-concepts/` §3 with the concrete data model + protocol.
- `findings.md` — full writeup: data model, journal/action types, federation protocol,
capability profile, INTENT mapping, UC seeds (UC-70–UC-72), architecture notes for
SHARD-WP-0002, open questions, sources, traceability.
Catalog yield: UC-70 (attach a fedwiki site via page-JSON + CORS), UC-71 (append-only
semantic action journal with site provenance as a coordination-journal model), UC-72
(fork-with-site-provenance federation across a neighborhood / chorus). Enriched
UC-26/28/30/05/27. Feeds SHARD-WP-0002 T1–T5 (federation) and T11/T13/T16 (write
granularity, log-based merge, identity≠placement).

@@ -0,0 +1,240 @@
# Federated Wiki — deep dive (findings)
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T1 · **Subject:** Ward Cunningham's
Smallest Federated Wiki (SFW) / Federated Wiki (fedwiki ecosystem).
## Why this dive
Every prior dive has been a *shard candidate* — a store we might attach. Federated Wiki
is different: it is a **federation model**, the one piece of public prior art whose core
job is the same as shard-wiki's coordination layer — *present a union of pages from many
independent sites while preserving where each came from, and let people copy and edit
non-destructively*. Ward Cunningham (inventor of the wiki) built SFW in 2011 precisely to
fix the original wiki's single-canonical-page weakness with **fork + provenance**. We go
past the surface (`260608-federation-concepts/` §3) into the data model and protocol, then
ask what shard-wiki should adopt.
**Framing:** fedwiki is not just "a shard we attach" — it is a *worked example of the
coordination journal, overlay-before-mutation, and union-without-erasure*, three of our
own design pillars, shipped and running.
---
## 1. The data model — page = title + story + journal
A fedwiki page is a small JSON object with three core fields (plus optional decoration):
```json
{
"title": "Welcome Visitors",
"story": [
{ "type": "paragraph", "id": "7b56f22a4b9ee974",
"text": "Welcome to this [[Federated Wiki]] site." },
{ "type": "image", "id": "a1c0e3...", "url": "...", "caption": "..." }
],
"journal": [
{ "type": "create", "id": "7b56f22a4b9ee974", "item": {...}, "date": 1310000000000 },
{ "type": "add", "id": "a1c0e3...", "item": {...}, "after": "7b56f22a4b9ee974",
"date": 1310000100000 },
{ "type": "edit", "id": "7b56f22a4b9ee974", "item": {...}, "date": 1310000200000 },
{ "type": "fork", "site": "ward.fed.wiki.org", "date": 1310000300000 }
]
}
```
- **story** — an *ordered array of typed items* ("paragraph-like" items). Each item is
`{ type, id, text, ...type-specific }`. The **`id`** is a random 16-hex string,
**stable across edits** (it is the unit of identity within a page). The **`type`** names
the **plugin** that renders/edits the item (`paragraph`, `image`, `html`, `markdown`,
`code`, `method`, `pagefold`, chart plugins, …). *Data lives in the item; behavior lives
in the plugin* — the item is portable JSON; the plugin is the renderer.
- **journal** — an *ordered, append-only array of action objects* that, when replayed,
**reconstructs the story**. The story is a materialized view of the journal. This is the
key architectural choice: **the journal is the source of truth, the story is derived.**
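As a minimal sketch of the shape described above, the following Python builds an empty page and a story item with a random 16-hex `id`. The helper names `new_item` and `new_page` are hypothetical and exist only for illustration; the field names come from the JSON example, and using `secrets.token_hex` to produce the id is an assumption about how one might generate it, not fedwiki's own code.

```python
import secrets


def new_item(item_type: str, text: str) -> dict:
    """Build a story item; the id is 8 random bytes rendered as 16 hex chars."""
    return {"type": item_type, "id": secrets.token_hex(8), "text": text}


def new_page(title: str) -> dict:
    """Build an empty page with the three core fields."""
    return {"title": title, "story": [], "journal": []}


page = new_page("Welcome Visitors")
item = new_item("paragraph", "Welcome to this [[Federated Wiki]] site.")
```

The id is generated once and carried unchanged through later edits, which is what lets journal actions refer to an item regardless of its position in the story.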
## 2. Journal action types — a semantic op-log
Each journal entry is an action with `{ type, ... , date }` (epoch-ms). The action types:
| action | fields | meaning |
|---------|--------|---------|
| `create`| `id, item, date` | first item — page born |
| `add` | `id, item, after, date` | insert an item after another |
| `edit` | `id, item, date` | replace an item's content (id preserved) |
| `move` | `order, date` | reorder items |
| `remove`| `id, date` | delete an item |
| `fork` | `site, date` | **mark that the page was copied from `site` at this point** |
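The replay idea can be sketched in a few lines of Python. This is an illustration of the table above, not fedwiki's implementation: the function name `replay` is hypothetical, `create` is treated like an `add` without an anchor, a missing or unknown `after` appends at the end, and `order` is assumed to be the full list of item ids in their new order.

```python
import copy


def replay(journal: list) -> list:
    """Rebuild a story (ordered list of items) from journal actions."""
    story = []

    def index_of(item_id):
        # Position of the item with this id, or None if absent.
        return next((i for i, it in enumerate(story) if it["id"] == item_id), None)

    for action in journal:
        kind = action["type"]
        if kind in ("create", "add"):
            anchor = index_of(action.get("after"))
            # Assumption: no usable anchor means "append at the end".
            pos = len(story) if anchor is None else anchor + 1
            story.insert(pos, copy.deepcopy(action["item"]))
        elif kind == "edit":
            i = index_of(action["id"])
            if i is not None:
                story[i] = copy.deepcopy(action["item"])
        elif kind == "move":
            by_id = {it["id"]: it for it in story}
            story = [by_id[i] for i in action["order"] if i in by_id]
        elif kind == "remove":
            story = [it for it in story if it["id"] != action["id"]]
        elif kind == "fork":
            pass  # provenance marker only; the story is unchanged
    return story


journal = [
    {"type": "create", "id": "a", "item": {"type": "paragraph", "id": "a", "text": "one"}, "date": 1},
    {"type": "add", "id": "b", "item": {"type": "paragraph", "id": "b", "text": "two"}, "after": "a", "date": 2},
    {"type": "edit", "id": "a", "item": {"type": "paragraph", "id": "a", "text": "ONE"}, "date": 3},
    {"type": "move", "order": ["b", "a"], "date": 4},
    {"type": "fork", "site": "ward.fed.wiki.org", "date": 5},
]
story = replay(journal)
texts = [it["text"] for it in story]  # ["two", "ONE"]
```

Note that `fork` changes nothing in the story: it only records where the page came from, which is why it can serve as a pure provenance entry.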
Two things matter for us:
1. **These are *semantic* operations** (add/move/edit/remove a paragraph), not text diffs
and not character-level CRDT ops. The write granularity is the **story item
(paragraph)** — a *middle* granularity between whole-file (TiddlyWiki) and
block/character (Logseq/CRDT). It is an **op-log** like a CRDT, but the ops are
coarse-grained and **applied by humans via fork**, not auto-merged.
2. **`fork` is the provenance primitive.** When you copy a remote page to your own site,
a `fork` entry is appended recording the **source site** and time. The journal of a
forked page therefore **serializes a directed acyclic graph (DAG)** of where content
came from — "the journal of a forked page is detailed enough to recognize where in the
journal of the original the fork took place" (CouchDB-style per-entry sequence numbers
make the cut-point identifiable). History visualization highlights the forked entry.
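One way to picture "recognizing where the fork took place" is to compare two journals entry by entry. The sketch below assumes a fork copies the original journal verbatim and then appends to it, so the shared prefix marks the cut-point; `fork_point` and `fork_sources` are hypothetical helper names, and real journals may need a looser equality than exact dict comparison.

```python
def fork_point(original: list, forked: list) -> int:
    """Return how many leading journal entries the two journals share."""
    shared = 0
    for a, b in zip(original, forked):
        if a != b:
            break
        shared += 1
    return shared


def fork_sources(journal: list) -> list:
    """List (site, date) for every fork entry, oldest first."""
    return [(a["site"], a["date"]) for a in journal if a["type"] == "fork"]


original = [
    {"type": "create", "id": "a", "item": {"id": "a", "text": "one"}, "date": 10},
    {"type": "edit", "id": "a", "item": {"id": "a", "text": "uno"}, "date": 20},
    {"type": "edit", "id": "a", "item": {"id": "a", "text": "eins"}, "date": 40},
]
forked = original[:2] + [
    {"type": "fork", "site": "ward.fed.wiki.org", "date": 30},
    {"type": "edit", "id": "a", "item": {"id": "a", "text": "un"}, "date": 35},
]
cut = fork_point(original, forked)   # 2 entries in common
sources = fork_sources(forked)       # [("ward.fed.wiki.org", 30)]
```

Everything after index `cut` in either journal is divergent history, which is the material a human would inspect when deciding which version to fork.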
## 3. The federation protocol — sites, neighborhood, roster
- **Site** = an independent server (originally Node.js; also static-file and serverless
variants). A site owns a set of pages, each served as **page JSON over HTTP** at
`/<slug>.json`, with **CORS headers** so a *browser-side* client can fetch pages from
**any** site. Page identity within a site is the **slug** (a title-derived kebab name).
- **The client assembles the union, not the server.** The fedwiki client renders pages
**side by side** in a horizontal row ("the lineup"): clicking a link opens that page *from whatever site it
resolves against*, appended to the right. Browsing literally builds a left-to-right
trail across sites.
- **Neighborhood** = the dynamic set of sites encountered in the current session (from the
sites of pages you've opened, links, and forks). **Search runs across the neighborhood**
— a federated search over exactly the sites you've touched.
- **Roster** = an explicit, authored list of sites to include (a curated neighborhood);
"sister sites" are peers you watch. There is **no central registry** — discovery is by
link, fork, and roster.
- **Happenings** = time-bounded collaborative events where many participants fork around a
topic for a period, producing a burst of related forks (a bounded collaboration that
leaves a durable forked record on each participant's own site).
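The addressing and discovery rules above can be sketched without any network code. In the Python below, `as_slug`, `page_url` and `Neighborhood` are hypothetical names; the slug rule (whitespace to hyphens, other punctuation dropped, lowercased) and the `http` default scheme are assumptions for illustration, and the neighborhood only grows from visited sites and `fork` entries, ignoring links.

```python
import re


def as_slug(title: str) -> str:
    """Kebab-style slug: whitespace to hyphens, drop other punctuation, lowercase."""
    hyphenated = re.sub(r"\s", "-", title)
    return re.sub(r"[^A-Za-z0-9-]", "", hyphenated).lower()


def page_url(site: str, title: str, scheme: str = "http") -> str:
    """URL of a page's JSON on a given site, following the /<slug>.json layout."""
    return f"{scheme}://{site}/{as_slug(title)}.json"


class Neighborhood:
    """Dynamic set of sites encountered during a session."""

    def __init__(self):
        self.sites = set()

    def visit(self, site: str, page: dict) -> None:
        # The site we opened the page from joins the neighborhood...
        self.sites.add(site)
        # ...and so does every site named in a fork entry of its journal.
        for action in page.get("journal", []):
            if action.get("type") == "fork" and "site" in action:
                self.sites.add(action["site"])

    def candidate_urls(self, title: str) -> list:
        """Where to look for a title across every site touched so far."""
        return sorted(page_url(s, title) for s in self.sites)


hood = Neighborhood()
hood.visit("alice.example", {"journal": [{"type": "fork", "site": "ward.fed.wiki.org", "date": 1}]})
urls = hood.candidate_urls("Welcome Visitors")
```

A roster would simply pre-populate `sites` from an authored list instead of letting it accumulate from browsing.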
## 4. The editorial model — fork, don't edit-in-place
You can only write to **your own** site. To change someone else's page you **fork** it
(copy into your site, journal records the source), then edit your copy. Many forks of the
same page coexist across sites — Cunningham's **"chorus of voices"**: *no canonical
version*, divergence is normal and visible, and you choose whose changes to pull by forking
them. There is **no automatic merge** — reconciliation is human: compare journals, fork the
version you prefer, optionally re-fork upstream changes.
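The fork-then-edit flow can be sketched as two small functions. This is an illustration under stated assumptions, not fedwiki's code: `fork_page` and `edit_item` are hypothetical names, the fork is modeled as a deep copy plus one appended `fork` entry, and an edit is modeled as updating the story and appending the matching `edit` action.

```python
import copy
import time
from typing import Optional


def fork_page(remote_page: dict, source_site: str, now_ms: Optional[int] = None) -> dict:
    """Copy a remote page for the local site and record where it came from."""
    local = copy.deepcopy(remote_page)  # the remote is never mutated
    date = int(time.time() * 1000) if now_ms is None else now_ms
    local.setdefault("journal", []).append({"type": "fork", "site": source_site, "date": date})
    return local


def edit_item(page: dict, item_id: str, text: str, now_ms: int) -> None:
    """Edit one item of a page we own: update the story, append the action."""
    for item in page["story"]:
        if item["id"] == item_id:
            item["text"] = text
            page["journal"].append(
                {"type": "edit", "id": item_id, "item": copy.deepcopy(item), "date": now_ms}
            )
            return
    raise KeyError(item_id)


remote = {
    "title": "Welcome Visitors",
    "story": [{"type": "paragraph", "id": "a", "text": "hello"}],
    "journal": [{"type": "create", "id": "a",
                 "item": {"type": "paragraph", "id": "a", "text": "hello"}, "date": 1}],
}
mine = fork_page(remote, "ward.fed.wiki.org", now_ms=2)
edit_item(mine, "a", "hello, neighbors", now_ms=3)
```

The remote page is untouched throughout; only the local copy gains the `fork` and `edit` entries.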
---
## 5. Capability profile
| Dimension (synthesis spectrum) | Federated Wiki |
|--------------------------------|----------------|
| Attachment mode | **REST/file-store hybrid** — page JSON over HTTP+CORS; also static files |
| Addressing granularity | **story item (paragraph)** via stable 16-hex `id` |
| Content identity | item `id` random+stable; page id = site + slug |
| Identity vs placement | **placement-bound**: identity = `site` + `slug`; forks are *new* identities linked by journal provenance |
| Structure | ordered array of **typed items** (plugin-typed) |
| History | **per-page append-only journal** of semantic actions (op-log) |
| Merge model | **fork + manual journal compare** — a *third model* beside git 3-way and CRDT auto-merge |
| Native query | none built-in; **neighborhood search** (federated full-text across touched sites) |
| Translation | item `text` is wiki/Markdown-ish; plugins own their formats |
| Attachment/write granularity | **story-item level** (add/edit/move/remove one item) |
| Operational envelope | tiny servers, browser-driven; CORS is the whole API surface |
| Access grant | **own-site-only writes**; reads open via CORS |
| Content opacity | transparent JSON (no E2EE); plugin-typed but inspectable |
| Provenance | **first-class**: `fork` records source site; journal = provenance DAG |
## 6. INTENT mapping
### Reinforcements (fedwiki validates our pillars)
- **Coordination journal** (INTENT) ≈ fedwiki **journal**. Our journal idea is *exactly*
fedwiki's per-page append-only action log — and fedwiki proves the story-as-derived-view
pattern works. Strong reinforcement; adopt the **semantic-op + provenance-entry** shape.
- **Overlay before mutation** ≈ **fork**. Fork *is* the canonical overlay: a
non-destructive copy onto a writable surface, recording provenance, before any change.
- **Union without erasure** ≈ **neighborhood + chorus**. The union is assembled from many
sovereign sites; provenance (which site, forked-from) is never hidden; divergence is
surfaced, not resolved away.
- **No silent remote mutation** ≈ **own-site-only writes**. You structurally *cannot*
mutate a remote; you fork to your own site. This is our rule, enforced by architecture.
- **Mechanism over policy** ≈ **no canonical source**. Fedwiki ships the mechanism (fork,
journal, neighborhood) and leaves "which version wins" entirely to people.
- **Graceful degradation** ≈ static-file sites — a fedwiki site can be a read-only pile of
JSON files; still forkable, still in the neighborhood.
### Divergences (boundaries / design notes, not bugs)
- **Identity = placement.** Fedwiki page identity is `site` + `slug`; a fork is a *new*
page whose only tie to the origin is a journal `fork` entry. shard-wiki wants
**identity ≠ placement** (the "same" page across shards under a stable identity, T16) —
so we treat fedwiki's journal-linked forks as *provenance edges*, and layer our own
cross-shard identity over them rather than adopting slug-as-identity.
- **No query / no typed-record model.** Fedwiki is paragraphs+plugins, not a typed DB
(contrast Notion/Wikibase). Fine — it sits at the *coordination* end, not the structure
end. We don't ask fedwiki to provide query; the neighborhood search is the model for
*federated* search across shards (T-federation), not in-shard query.
- **Browser-assembles-union.** Fedwiki pushes union assembly to the client. shard-wiki
assembles server/orchestrator-side. Adopt the *model* (union from sovereign sources +
provenance), not the client-only locus.
### What to keep
1. **Journal = append-only semantic-op log with provenance entries**, story = derived
replay view. This is the concrete shape for our coordination journal (T13).
2. **Fork-with-source-attribution** as the overlay/adopt primitive across shards.
3. **Neighborhood** as the model for a *dynamic, link-and-fork-discovered* federated set +
search, with **roster** as the curated/explicit variant.
4. **Chorus of forks** — represent divergent versions across shards as co-equal, linked by
provenance, with reconciliation as an explicit human/policy step (mechanism over policy).
---
## 7. UC seeds
| # | Seed | Disposition |
|---|------|-------------|
| UC-70 | Attach a Federated Wiki site as a shard via its **page JSON + CORS** (REST/file-store hybrid); project pages, fork to overlay | **new** |
| UC-71 | Adopt a **per-page append-only semantic-action journal with provenance entries** (fork=source site) as the coordination-journal model — replay to materialize, compare to locate divergence | **new** |
| UC-72 | **Fork-with-site-provenance federation across a neighborhood** of peer shards — assemble a union from links/forks, search across it, preserve the chorus without forcing a canonical | **new** |
| — | fork-with-provenance as overlay/adopt | enrich **UC-26** (fork) |
| — | carry-forward of forked content + upstream re-fork | enrich **UC-28** (carry-forward) |
| — | happenings = time-bounded collaboration leaving durable forks | enrich **UC-30** (time-bounded space) |
| — | union/chorus of co-equal versions, provenance-linked | enrich **UC-05 / UC-27** |
## 8. Architecture notes for SHARD-WP-0002
- **T1–T5 (federation):** fedwiki is the reference design. The **journal** (append-only,
semantic ops, fork-provenance) is the concrete coordination-journal shape; **neighborhood
+ roster** is the discovery/membership model (dynamic vs curated); **fork** is the
overlay/adopt op. Model the union as an assembly over sovereign sources with provenance
edges, reconciliation left to policy.
- **T11 (capability/write-granularity):** add **story-item / paragraph** as a named
write-granularity tier between whole-file and block/character.
- **T13 (history portability / merge model):** record fedwiki's **journal-replay op-log**
as a *third merge model* beside git 3-way and CRDT auto-merge — a **coarse semantic
op-log applied manually via fork**. A shard whose history *is* such a journal can supply
our coordination journal almost directly (vs git-commit import or CRDT-update import).
- **T16 (identity ≠ placement):** fedwiki's `fork` journal entries are **provenance edges**
between same-named pages on different sites — exactly the cross-shard "same page,
different placement" relation we must model. Use them as edges; keep our own identity
layer above slug.
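To make the "fork entries as provenance edges" reading concrete, here is one possible sketch in Python. It is an exploration, not a decision on the open identity question: `provenance_edges` and `identity_groups` are hypothetical names, a fork is assumed to keep the slug of its source, and grouping placements by connected component is only one candidate for an identity layer above slug.

```python
from collections import defaultdict


def provenance_edges(pages) -> list:
    """pages: iterable of (site, slug, page_json).
    Returns edges ((site, slug), (source_site, slug)) for every fork entry."""
    edges = []
    for site, slug, page in pages:
        for action in page.get("journal", []):
            if action.get("type") == "fork":
                # Assumption: the forked page kept the source page's slug.
                edges.append(((site, slug), (action["site"], slug)))
    return edges


def identity_groups(edges) -> list:
    """Group placements connected by fork edges (union-find with path halving)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(g) for g in groups.values())


pages = [
    ("alice.example", "welcome-visitors",
     {"journal": [{"type": "fork", "site": "ward.fed.wiki.org", "date": 1}]}),
    ("bob.example", "welcome-visitors",
     {"journal": [{"type": "fork", "site": "alice.example", "date": 2}]}),
    ("carol.example", "other-page", {"journal": []}),
]
edges = provenance_edges(pages)
groups = identity_groups(edges)
```

The edges stay directed and dated in the journal; the grouping is a derived view that can be recomputed or replaced by a different identity policy without touching the provenance record.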
## 9. Open questions
1. Should shard-wiki's coordination journal adopt fedwiki's **exact action vocabulary**
(create/add/edit/move/remove/fork) at the page-item level, or a more granular/abstract
op set that other shards can also emit?
2. Is **neighborhood** (dynamic, link/fork-discovered) a first-class membership mode for an
information space, or only a *view* over an explicitly-configured shard set (roster)?
3. How do we reconcile fedwiki's **slug-as-identity + fork-DAG** with our intended
**stable cross-shard identity** (T16) — promote fork edges into the identity graph, or
keep them as provenance-only annotations?
4. Does the **chorus / no-canonical** stance compose with shards that *do* assert a
canonical (Notion, an upstream git main)? (policy-selectable canonical over a
mechanism that permits chorus.)
## 10. Sources
- Smallest Federated Wiki wiki: **Story JSON**, **Federation Details**
github.com/WardCunningham/Smallest-Federated-Wiki/wiki
- JSON Schema notes — song.fed.wiki.org/json-schema.html
- "Smallest Federated Wiki" — home.c2.com/smallest-federated-wiki.html
- Federated Wiki — federated.wiki (Visualizing Page History)
- Mike Caulfield, "The OER Case for Federated Wiki" — hapgood.us (2015)
- Jon Udell, "A federated Wikipedia" — blog.jonudell.net (2015)
- Wikipedia: *Federated Wiki*; IndieWeb: *Smallest Federated Wiki*
- fedwiki/wiki-plugin-transport (plugin/transport reference)
- prior: `research/260608-federation-concepts/` §3
## 11. Traceability
New UCs **UC-70–UC-72** carry the marker **⊞** in the wikiengines column of
`spec/UseCaseCatalog.md` (true lineage = this dive; placed in the nearest existing column).
Enriched: UC-26, UC-28, UC-30, UC-05, UC-27. Architecture cross-refs: SHARD-WP-0002
T1–T5, T11, T13, T16.

@@ -27,4 +27,5 @@ when multiple files or sources are involved. Findings here inform `spec/` and
| 2026-06-14 | `260614-logseq-deep-dive/` | Logseq — block-graph on plain Markdown files, in-file block IDs, derived Datalog index; UC-62/63 |
| 2026-06-14 | `260614-localfirst-workspaces-deep-dive/` | Anytype · AFFiNE · AppFlowy — CRDT local-first workspaces (any-sync/Yjs/Yrs), native merge, P2P/E2EE; UC-64/65 |
| 2026-06-14 | `260614-trilium-deep-dive/` | Trilium/TriliumNext — note cloning (DAG hierarchy), attribute inheritance/templates, HTML-native, scripting+ETAPI; UC-66/67 |
| 2026-06-14 | `260614-wikijs-deep-dive/` | Wiki.js — storage-module engine (DB↔Git Markdown), GraphQL API, pluggable modules ≈ adapter-contract prior art; UC-68/69 |
| 2026-06-14 | `260614-federated-wiki-deep-dive/` | Federated Wiki — fork-with-provenance, per-page semantic-action journal (story=replay), neighborhood/roster + chorus; prior art for our coordination journal / overlay / union pillars; UC-70/71/72 |