shard-wiki/research/260614-federated-wiki-deep-dive/findings.md
tegwick 036dbad816 research: Federated Wiki deep dive (journal/fork/neighborhood); UC-70-72
SHARD-WP-0003 T1. Federation model (not a shard candidate): per-page
append-only semantic-action journal with story as derived replay,
fork-with-site-provenance, neighborhood/roster discovery + chorus of forks.
Prior art for shard-wiki's own pillars: coordination journal (UC-71),
overlay-before-mutation (UC-26 fork), union-without-erasure (UC-72).
Attach as REST/file-store hybrid (page JSON + CORS, UC-70). Feeds
SHARD-WP-0002 T1-T5, T11, T13, T16.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 19:01:13 +02:00


Federated Wiki — deep dive (findings)

Date: 2026-06-14 · Source: SHARD-WP-0003 T1 · Subject: Ward Cunningham's Smallest Federated Wiki (SFW) / Federated Wiki (fedwiki ecosystem).

Why this dive

Every prior dive has been a shard candidate — a store we might attach. Federated Wiki is different: it is a federation model, the one piece of public prior art whose core job is the same as shard-wiki's coordination layer — present a union of pages from many independent sites while preserving where each came from, and let people copy and edit non-destructively. Ward Cunningham (inventor of the wiki) built SFW in 2011 precisely to fix the original wiki's single-canonical-page weakness with fork + provenance. We go past the surface (260608-federation-concepts/ §3) into the data model and protocol, then ask what shard-wiki should adopt.

Framing: fedwiki is not just "a shard we attach" — it is a worked example of the coordination journal, overlay-before-mutation, and union-without-erasure, three of our own design pillars, shipped and running.


1. The data model — page = title + story + journal

A fedwiki page is a small JSON object with three core fields (plus optional decoration):

{
  "title": "Welcome Visitors",
  "story": [
    { "type": "paragraph", "id": "7b56f22a4b9ee974",
      "text": "Welcome to this [[Federated Wiki]] site." },
    { "type": "image", "id": "a1c0e3...", "url": "...", "caption": "..." }
  ],
  "journal": [
    { "type": "create", "id": "7b56f22a4b9ee974", "item": {...}, "date": 1310000000000 },
    { "type": "add",  "id": "a1c0e3...", "item": {...}, "after": "7b56f22a4b9ee974",
      "date": 1310000100000 },
    { "type": "edit", "id": "7b56f22a4b9ee974", "item": {...}, "date": 1310000200000 },
    { "type": "fork", "site": "ward.fed.wiki.org", "date": 1310000300000 }
  ]
}
  • story — an ordered array of typed items ("paragraph-like" items). Each item is { type, id, text, ...type-specific }. The id is a random 16-hex string, stable across edits (it is the unit of identity within a page). The type names the plugin that renders/edits the item (paragraph, image, html, markdown, code, method, pagefold, chart plugins, …). Data lives in the item; behavior lives in the plugin — the item is portable JSON; the plugin is the renderer.
  • journal — an ordered, append-only array of action objects that, when replayed, reconstructs the story. The story is a materialized view of the journal. This is the key architectural choice: the journal is the source of truth, the story is derived.
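As a small illustration of the item shape described above, the sketch below builds a story item with a random 16-hex id and shows an edit that keeps the id. The helper names `new_item` and `edit_item` are hypothetical, and using 8 random bytes is just one assumed way to get 16 hex digits; fedwiki's own id generator is not reproduced here.

```python
import secrets

def new_item(item_type: str, text: str) -> dict:
    """Build a story item with a fresh random 16-hex id (8 random bytes).

    Hypothetical helper: the field names follow the page JSON shown above.
    """
    return {"type": item_type, "id": secrets.token_hex(8), "text": text}

def edit_item(item: dict, text: str) -> dict:
    """Return an edited copy of the item; the id is carried over unchanged."""
    return {**item, "text": text}

para = new_item("paragraph", "Welcome to this [[Federated Wiki]] site.")
edited = edit_item(para, "Welcome, visitors.")
```

The id, not the position or the text, is what later journal actions refer to, which is why the edit keeps it.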

2. Journal action types — a semantic op-log

Each journal entry is an action with { type, ... , date } (epoch-ms). The action types:

| action | fields | meaning |
|---|---|---|
| create | id, item, date | first item (page born) |
| add | id, item, after, date | insert an item after another |
| edit | id, item, date | replace an item's content (id preserved) |
| move | order, date | reorder items |
| remove | id, date | delete an item |
| fork | site, date | mark that the page was copied from site at this point |
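The "story is a replay of the journal" idea can be sketched as a small fold over the action list. This is a minimal sketch that follows the action fields as listed in the table above, not fedwiki's actual replay code. Two behaviours are assumptions: an `add` whose `after` id is unknown appends at the end, and a `move` keeps any item missing from `order` at the tail.

```python
from typing import Any

Item = dict[str, Any]
Action = dict[str, Any]

def replay(journal: list[Action]) -> list[Item]:
    """Rebuild the story by applying journal actions in order."""
    story: list[Item] = []
    for action in journal:
        kind = action["type"]
        if kind == "create":
            story = [action["item"]]
        elif kind == "add":
            ids = [it["id"] for it in story]
            after = action.get("after")
            # Assumption: unknown or missing "after" means append at the end.
            pos = ids.index(after) + 1 if after in ids else len(story)
            story.insert(pos, action["item"])
        elif kind == "edit":
            story = [action["item"] if it["id"] == action["id"] else it
                     for it in story]
        elif kind == "move":
            by_id = {it["id"]: it for it in story}
            order = [i for i in action["order"] if i in by_id]
            # Assumption: items not named in "order" stay, after the ordered ones.
            rest = [it for it in story if it["id"] not in set(order)]
            story = [by_id[i] for i in order] + rest
        elif kind == "remove":
            story = [it for it in story if it["id"] != action["id"]]
        # "fork" records provenance only; it leaves the story untouched.
    return story

def para(item_id: str, text: str) -> Item:
    return {"type": "paragraph", "id": item_id, "text": text}

journal = [
    {"type": "create", "id": "a", "item": para("a", "one"), "date": 1},
    {"type": "add", "id": "b", "item": para("b", "two"), "after": "a", "date": 2},
    {"type": "add", "id": "c", "item": para("c", "three"), "after": "a", "date": 3},
    {"type": "edit", "id": "a", "item": para("a", "ONE"), "date": 4},
    {"type": "move", "order": ["b", "a", "c"], "date": 5},
    {"type": "remove", "id": "c", "date": 6},
    {"type": "fork", "site": "ward.fed.wiki.org", "date": 7},
]
story = replay(journal)
```

The short ids (`a`, `b`, `c`) are placeholders for readability; real ids are 16-hex strings.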

Two things matter for us:

  1. These are semantic operations (add/move/edit/remove a paragraph), not text diffs and not character-level CRDT ops. The write granularity is the story item (paragraph) — a middle granularity between whole-file (TiddlyWiki) and block/character (Logseq/CRDT). It is an op-log like a CRDT, but the ops are coarse-grained and applied by humans via fork, not auto-merged.
  2. fork is the provenance primitive. When you copy a remote page to your own site, a fork entry is appended recording the source site and time. The journal of a forked page therefore serializes a directed acyclic graph (DAG) of where content came from — "the journal of a forked page is detailed enough to recognize where in the journal of the original the fork took place" (CouchDB-style per-entry sequence numbers make the cut-point identifiable). History visualization highlights the forked entry.
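To make the provenance reading concrete, the sketch below pulls the fork entries out of a journal and treats everything before the latest fork entry as history carried over from the source site. That "prefix is inherited" reading is an interpretation of the description above, and the function names are hypothetical.

```python
from typing import Any

Action = dict[str, Any]

def fork_points(journal: list[Action]) -> list[tuple[int, str, int]]:
    """Return (journal index, source site, date) for every fork entry."""
    return [(i, a["site"], a["date"])
            for i, a in enumerate(journal) if a["type"] == "fork"]

def inherited_prefix(journal: list[Action]) -> list[Action]:
    """Entries before the latest fork entry, read here as copied history."""
    points = fork_points(journal)
    return journal[: points[-1][0]] if points else []

# The journal from the page example in section 1, items elided to {}.
journal = [
    {"type": "create", "id": "7b56f22a4b9ee974", "item": {}, "date": 1310000000000},
    {"type": "add", "id": "a1c0e3", "item": {}, "after": "7b56f22a4b9ee974",
     "date": 1310000100000},
    {"type": "edit", "id": "7b56f22a4b9ee974", "item": {}, "date": 1310000200000},
    {"type": "fork", "site": "ward.fed.wiki.org", "date": 1310000300000},
]
points = fork_points(journal)
```

Following the `site` of each fork entry back to that site's own copy of the page is what yields the provenance graph; this sketch stops at a single journal.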

3. The federation protocol — sites, neighborhood, roster

  • Site = an independent server (originally Node.js; also static-file and serverless variants). A site owns a set of pages, each served as page JSON over HTTP at /<slug>.json, with CORS headers so a browser-side client can fetch pages from any site. Page identity within a site is the slug (a title-derived kebab name).
  • The client assembles the union, not the server. The fedwiki client ("the lineup") renders pages side by side: clicking a link opens that page from whatever site it resolves against, appended to the right. Browsing literally builds a left-to-right trail across sites.
  • Neighborhood = the dynamic set of sites encountered in the current session (from the sites of pages you've opened, links, and forks). Search runs across the neighborhood — a federated search over exactly the sites you've touched.
  • Roster = an explicit, authored list of sites to include (a curated neighborhood); "sister sites" are peers you watch. There is no central registry — discovery is by link, fork, and roster.
  • Happenings = time-bounded collaborative events where many participants fork around a topic for a period, producing a burst of related forks (a bounded collaboration that leaves a durable forked record on each participant's own site).
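The addressing and discovery rules above can be sketched without any network access. The slug rule below (whitespace to hyphens, drop other punctuation, lowercase) is an assumed approximation of "title-derived kebab name", the `http` scheme is an assumption, and `neighborhood` and `search` are hypothetical helpers operating on pages already held in memory. Hosts ending in `.example` are placeholders.

```python
import re

def as_slug(title: str) -> str:
    """Assumed slug rule: hyphenate whitespace, strip punctuation, lowercase."""
    s = re.sub(r"\s", "-", title.strip())
    return re.sub(r"[^A-Za-z0-9-]", "", s).lower()

def page_url(site: str, title: str, scheme: str = "http") -> str:
    """URL of a page's JSON on a site, following the /<slug>.json pattern."""
    return f"{scheme}://{site}/{as_slug(title)}.json"

def neighborhood(opened: list[tuple[str, dict]]) -> set[str]:
    """Sites met this session: each opened page's site plus its fork sources."""
    sites: set[str] = set()
    for site, page in opened:
        sites.add(site)
        for action in page.get("journal", []):
            if action.get("type") == "fork" and "site" in action:
                sites.add(action["site"])
    return sites

def search(pages_by_site: dict[str, list[dict]], hood: set[str],
           term: str) -> list[tuple[str, str]]:
    """Case-insensitive text search restricted to sites in the neighborhood."""
    needle = term.lower()
    hits = []
    for site in sorted(hood):
        for page in pages_by_site.get(site, []):
            if any(needle in item.get("text", "").lower()
                   for item in page.get("story", [])):
                hits.append((site, page["title"]))
    return hits

ward_page = {"title": "Welcome Visitors", "journal": [], "story": [
    {"type": "paragraph", "id": "7b56f22a4b9ee974",
     "text": "Welcome to this [[Federated Wiki]] site."}]}
alice_page = {"title": "Welcome Visitors", "story": [
    {"type": "paragraph", "id": "7b56f22a4b9ee974", "text": "Welcome to my fork."}],
    "journal": [{"type": "fork", "site": "ward.example", "date": 1310000300000}]}
carol_page = {"title": "Gardening", "journal": [], "story": [
    {"type": "paragraph", "id": "0f0f0f0f0f0f0f0f", "text": "Federated gardening notes."}]}

pages_by_site = {"ward.example": [ward_page], "alice.example": [alice_page],
                 "carol.example": [carol_page]}
hood = neighborhood([("alice.example", alice_page)])
```

Opening only the page on `alice.example` pulls `ward.example` into the neighborhood through the fork entry, while `carol.example` stays outside it and is not searched.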

4. The editorial model — fork, don't edit-in-place

You can only write to your own site. To change someone else's page you fork it (copy into your site, journal records the source), then edit your copy. Many forks of the same page coexist across sites — Cunningham's "chorus of voices": no canonical version, divergence is normal and visible, and you choose whose changes to pull by forking them. There is no automatic merge — reconciliation is human: compare journals, fork the version you prefer, optionally re-fork upstream changes.
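A minimal sketch of the fork-then-edit flow, with a helper that finds where two journals stop agreeing. `fork_page` and `divergence_index` are hypothetical names; comparing journal entries by plain equality is an assumption, chosen only because it is the simplest way to locate a shared prefix.

```python
import copy
import time
from typing import Any, Optional

Page = dict[str, Any]

def fork_page(page: Page, source_site: str, now_ms: Optional[int] = None) -> Page:
    """Copy a remote page for the local site and record where it came from.

    The remote page object is never mutated; only the copy gains a fork entry.
    """
    mine = copy.deepcopy(page)
    date = now_ms if now_ms is not None else int(time.time() * 1000)
    mine.setdefault("journal", []).append(
        {"type": "fork", "site": source_site, "date": date})
    return mine

def divergence_index(journal_a: list[dict], journal_b: list[dict]) -> int:
    """Length of the shared journal prefix (entries compared by equality)."""
    n = 0
    for a, b in zip(journal_a, journal_b):
        if a != b:
            break
        n += 1
    return n

item = {"type": "paragraph", "id": "7b56f22a4b9ee974", "text": "Hello."}
upstream = {"title": "Welcome Visitors", "story": [item], "journal": [
    {"type": "create", "id": item["id"], "item": item, "date": 1},
]}

mine = fork_page(upstream, "ward.example", now_ms=2)
upstream_len_after_fork = len(upstream["journal"])

# Both sides keep editing their own copy after the fork.
upstream["journal"].append(
    {"type": "edit", "id": item["id"], "item": {**item, "text": "Hi."}, "date": 3})
mine["journal"].append(
    {"type": "edit", "id": item["id"], "item": {**item, "text": "Hey."}, "date": 4})

cut = divergence_index(upstream["journal"], mine["journal"])
```

Here `cut` marks the last point both journals share; what to do with the entries after it is left to a person, matching the no-automatic-merge stance.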


5. Capability profile

| Dimension (synthesis spectrum) | Federated Wiki |
|---|---|
| Attachment mode | REST/file-store hybrid: page JSON over HTTP+CORS; also static files |
| Addressing granularity | story item (paragraph) via stable 16-hex id |
| Content identity | item id random+stable; page id = site + slug |
| Identity vs placement | placement-bound: identity = site + slug; forks are new identities linked by journal provenance |
| Structure | ordered array of typed items (plugin-typed) |
| History | per-page append-only journal of semantic actions (op-log) |
| Merge model | fork + manual journal compare; a third model beside git 3-way and CRDT auto-merge |
| Native query | none built-in; neighborhood search (federated full-text across touched sites) |
| Translation | item text is wiki/Markdown-ish; plugins own their formats |
| Attachment/write granularity | story-item level (add/edit/move/remove one item) |
| Operational envelope | tiny servers, browser-driven; CORS is the whole API surface |
| Access grant | own-site-only writes; reads open via CORS |
| Content opacity | transparent JSON (no E2EE); plugin-typed but inspectable |
| Provenance | first-class: fork records source site; journal = provenance DAG |

6. INTENT mapping

Reinforcements (fedwiki validates our pillars)

  • Coordination journal (INTENT) ≈ fedwiki journal. Our journal idea is exactly fedwiki's per-page append-only action log — and fedwiki proves the story-as-derived-view pattern works. Strong reinforcement; adopt the semantic-op + provenance-entry shape.
  • Overlay before mutation ≈ fork. Fork is the canonical overlay: a non-destructive copy onto a writable surface, recording provenance, before any change.
  • Union without erasure ≈ neighborhood + chorus. The union is assembled from many sovereign sites; provenance (which site, forked-from) is never hidden; divergence is surfaced, not resolved away.
  • No silent remote mutation ≈ own-site-only writes. You structurally cannot mutate a remote; you fork to your own site. This is our rule, enforced by architecture.
  • Mechanism over policy ≈ no canonical source. Fedwiki ships the mechanism (fork, journal, neighborhood) and leaves "which version wins" entirely to people.
  • Graceful degradation ≈ static-file sites — a fedwiki site can be a read-only pile of JSON files; still forkable, still in the neighborhood.

Divergences (boundaries / design notes, not bugs)

  • Identity = placement. Fedwiki page identity is site + slug; a fork is a new page whose only tie to the origin is a journal fork entry. shard-wiki wants identity ≠ placement (the "same" page across shards under a stable identity, T16) — so we treat fedwiki's journal-linked forks as provenance edges, and layer our own cross-shard identity over them rather than adopting slug-as-identity.
  • No query / no typed-record model. Fedwiki is paragraphs+plugins, not a typed DB (contrast Notion/Wikibase). Fine — it sits at the coordination end, not the structure end. We don't ask fedwiki to provide query; the neighborhood search is the model for federated search across shards (T-federation), not in-shard query.
  • Browser-assembles-union. Fedwiki pushes union assembly to the client. shard-wiki assembles server/orchestrator-side. Adopt the model (union from sovereign sources + provenance), not the client-only locus.

What to keep

  1. Journal = append-only semantic-op log with provenance entries, story = derived replay view. This is the concrete shape for our coordination journal (T13).
  2. Fork-with-source-attribution as the overlay/adopt primitive across shards.
  3. Neighborhood as the model for a dynamic, link-and-fork-discovered federated set + search, with roster as the curated/explicit variant.
  4. Chorus of forks — represent divergent versions across shards as co-equal, linked by provenance, with reconciliation as an explicit human/policy step (mechanism over policy).
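One way the "chorus of forks" could be represented on the orchestrator side is a union keyed by slug, where every site's version is kept as a co-equal entry carrying its own provenance. This is a sketch under assumptions: `assemble_union` is a hypothetical name, the input is an in-memory map of site to slug to page, and no version is marked canonical. Hosts are placeholders.

```python
from collections import defaultdict
from typing import Any

Page = dict[str, Any]

def assemble_union(sites: dict[str, dict[str, Page]]) -> dict[str, list[dict]]:
    """Group every site's version of a slug together, keeping provenance.

    Each version records its site and the sites named by its fork entries;
    nothing is merged and nothing is dropped.
    """
    union: dict[str, list[dict]] = defaultdict(list)
    for site in sorted(sites):
        for slug, page in sites[site].items():
            forked_from = [a["site"] for a in page.get("journal", [])
                           if a.get("type") == "fork"]
            union[slug].append(
                {"site": site, "forked_from": forked_from, "page": page})
    return dict(union)

sites = {
    "ward.example": {
        "welcome-visitors": {"title": "Welcome Visitors", "story": [], "journal": []},
    },
    "alice.example": {
        "welcome-visitors": {"title": "Welcome Visitors", "story": [], "journal": [
            {"type": "fork", "site": "ward.example", "date": 1310000300000}]},
        "garden": {"title": "Garden", "story": [], "journal": []},
    },
}
union = assemble_union(sites)
```

A policy layer could later pick a preferred version per slug, but that choice would sit on top of this structure instead of replacing it.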

7. UC seeds

| # | Seed | Disposition |
|---|---|---|
| UC-70 | Attach a Federated Wiki site as a shard via its page JSON + CORS (REST/file-store hybrid); project pages, fork to overlay | new |
| UC-71 | Adopt a per-page append-only semantic-action journal with provenance entries (fork=source site) as the coordination-journal model; replay to materialize, compare to locate divergence | new |
| UC-72 | Fork-with-site-provenance federation across a neighborhood of peer shards: assemble a union from links/forks, search across it, preserve the chorus without forcing a canonical | new |
| | fork-with-provenance as overlay/adopt | enrich UC-26 (fork) |
| | carry-forward of forked content + upstream re-fork | enrich UC-28 (carry-forward) |
| | happenings = time-bounded collaboration leaving durable forks | enrich UC-30 (time-bounded space) |
| | union/chorus of co-equal versions, provenance-linked | enrich UC-05 / UC-27 |

8. Architecture notes for SHARD-WP-0002

  • T1-T5 (federation): fedwiki is the reference design. The journal (append-only, semantic ops, fork-provenance) is the concrete coordination-journal shape; neighborhood + roster is the discovery/membership model (dynamic vs curated); fork is the overlay/adopt op. Model the union as an assembly over sovereign sources with provenance edges, reconciliation left to policy.
  • T11 (capability/write-granularity): add story-item / paragraph as a named write-granularity tier between whole-file and block/character.
  • T13 (history portability / merge model): record fedwiki's journal-replay op-log as a third merge model beside git 3-way and CRDT auto-merge — a coarse semantic op-log applied manually via fork. A shard whose history is such a journal can supply our coordination journal almost directly (vs git-commit import or CRDT-update import).
  • T16 (identity ≠ placement): fedwiki's fork journal entries are provenance edges between same-named pages on different sites — exactly the cross-shard "same page, different placement" relation we must model. Use them as edges; keep our own identity layer above slug.
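To illustrate the T16 note, the sketch below treats fork entries as edges between (site, slug) placements and groups connected placements with a small union-find. It shows only one of the options left open in this document (promoting fork edges into an identity graph), not a decision. It also assumes a fork keeps the slug of its source, and `identity_groups` is a hypothetical name. Hosts are placeholders.

```python
from collections import defaultdict
from typing import Any

Placement = tuple[str, str]  # (site, slug)

def identity_groups(pages: dict[Placement, dict[str, Any]]) -> list[set[Placement]]:
    """Group placements connected by fork edges (same slug assumed on both ends)."""
    parent = {k: k for k in pages}

    def find(k: Placement) -> Placement:
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for (site, slug), page in pages.items():
        for action in page.get("journal", []):
            if action.get("type") != "fork":
                continue
            origin = (action["site"], slug)
            if origin in parent:  # ignore sources we have not attached
                parent[find((site, slug))] = find(origin)

    groups: dict[Placement, set[Placement]] = defaultdict(set)
    for k in pages:
        groups[find(k)].add(k)
    return sorted(groups.values(), key=lambda g: sorted(g))

def forked(site: str) -> dict[str, Any]:
    return {"journal": [{"type": "fork", "site": site, "date": 0}]}

pages = {
    ("ward.example", "a"): {"journal": []},
    ("alice.example", "a"): forked("ward.example"),
    ("bob.example", "a"): forked("alice.example"),
    ("carol.example", "a"): {"journal": []},  # same slug, no fork link
    ("ward.example", "b"): {"journal": []},
}
groups = identity_groups(pages)
```

Note that `carol.example`'s page shares the slug but not the lineage, so it lands in its own group; slug equality alone is not treated as identity here.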

9. Open questions

  1. Should shard-wiki's coordination journal adopt fedwiki's exact action vocabulary (create/add/edit/move/remove/fork) at the page-item level, or a more granular/abstract op set that other shards can also emit?
  2. Is neighborhood (dynamic, link/fork-discovered) a first-class membership mode for an information space, or only a view over an explicitly-configured shard set (roster)?
  3. How do we reconcile fedwiki's slug-as-identity + fork-DAG with our intended stable cross-shard identity (T16) — promote fork edges into the identity graph, or keep them as provenance-only annotations?
  4. Does the chorus / no-canonical stance compose with shards that do assert a canonical (Notion, an upstream git main)? (policy-selectable canonical over a mechanism that permits chorus.)

10. Sources

  • Smallest Federated Wiki wiki: Story JSON, Federation Details — github.com/WardCunningham/Smallest-Federated-Wiki/wiki
  • JSON Schema notes — song.fed.wiki.org/json-schema.html
  • "Smallest Federated Wiki" — home.c2.com/smallest-federated-wiki.html
  • Federated Wiki — federated.wiki (Visualizing Page History)
  • Mike Caulfield, "The OER Case for Federated Wiki" — hapgood.us (2015)
  • Jon Udell, "A federated Wikipedia" — blog.jonudell.net (2015)
  • Wikipedia: Federated Wiki; IndieWeb: Smallest Federated Wiki
  • fedwiki/wiki-plugin-transport (plugin/transport reference)
  • prior: research/260608-federation-concepts/ §3

11. Traceability

New UCs UC-70 to UC-72 carry the marker in the wikiengines column of spec/UseCaseCatalog.md (true lineage = this dive; placed in the nearest existing column). Enriched: UC-26, UC-28, UC-30, UC-05, UC-27. Architecture cross-refs: SHARD-WP-0002 T1-T5, T11, T13, T16.