generated from coulomb/repo-seed
Compare commits
140 Commits
9acd4e2841
...
main
| Author | SHA1 | Date | |
|---|---|---|---|
| 776f5af5a7 | |||
| fd961c83b4 | |||
| cca5bf83c3 | |||
| def699c1eb | |||
| a4e0f52ec1 | |||
| 4231daf94f | |||
| 37681d89b6 | |||
| a8e65235a8 | |||
| d7d046cac0 | |||
| 0b3ab2086f | |||
| d85d019543 | |||
| 3a5acdcb28 | |||
| 34b0c539f3 | |||
| da540d4eea | |||
| 951b24300d | |||
| c731c96634 | |||
| f0fee65cc0 | |||
| 34432c2e15 | |||
| 45a858ead0 | |||
| b31e9bc337 | |||
| e50dcc6b5c | |||
| a165cced33 | |||
| 8393a9c55d | |||
| ff96ee0c48 | |||
| 8b353f1077 | |||
| b9bb1f7d10 | |||
| c40fa3c934 | |||
| 54c2bf2ae5 | |||
| 6d8bd837a4 | |||
| b48a99d3c2 | |||
| 9b7f86ba69 | |||
| 74142096d0 | |||
| 2100e956aa | |||
| e62560eb5a | |||
| b147d3e831 | |||
| cdcf4b09aa | |||
| b21efe307b | |||
| e18397272a | |||
| 0ee972f2e2 | |||
| bb1b54e0af | |||
| b70f1c9acc | |||
| 8de044bbde | |||
| 67d851be0b | |||
| 46ec6f2a5f | |||
| ad4a2dbf5a | |||
| 23bc597343 | |||
| ec7a1ec946 | |||
| d6d7cc555f | |||
| 1dd1def40f | |||
| 6d341cd4e6 | |||
| 1b3d4aaa39 | |||
| 6e49bd3a4b | |||
| d2c73b02d9 | |||
| 042e198286 | |||
| 3ea0cc1226 | |||
| 4be2f190a0 | |||
| 7d00ae758e | |||
| 715ab1ca00 | |||
| d797bc5ee4 | |||
| 92d5774baf | |||
| e24f0034a0 | |||
| b2ea276c00 | |||
| 9fedb31d8b | |||
| 517cf1d282 | |||
| b44b2a74a4 | |||
| 24108b65aa | |||
| 6d48a1d3e6 | |||
| 9a4e00a05a | |||
| 5a77ea879c | |||
| aca9bf30f9 | |||
| 3b4f10a349 | |||
| f2dd2e124a | |||
| 1f3aba7aad | |||
| d65f9e21f3 | |||
| 802a80231a | |||
| 40575045ca | |||
| a2f15806f5 | |||
| 9b5fceb451 | |||
| d39e7f7765 | |||
| 1ad70a9c8a | |||
| 3a753a6f3b | |||
| 08a2148079 | |||
| cbd29e0a32 | |||
| 079652b61f | |||
| 84ab25eeb9 | |||
| 229d24d1cf | |||
| f21b7b5259 | |||
| c895d33091 | |||
| 59c36ac9d1 | |||
| 012d151fe8 | |||
| af840576e9 | |||
| 0383d78440 | |||
| dc451b0f4e | |||
| 04be66161e | |||
| c8dbe9b573 | |||
| b5d2cbc330 | |||
| ee2449d987 | |||
| f1dc7aff61 | |||
| dd812abb81 | |||
| 9b5b393519 | |||
| 6a88eb7710 | |||
| d898307b7e | |||
| 90a854f841 | |||
| 3d137c96b6 | |||
| f2f9f31df8 | |||
| 2c978d71f0 | |||
| b676579407 | |||
| 7fb90219b5 | |||
| 56b6cdd110 | |||
| 4acadacfee | |||
| 1546ca09bf | |||
| a3ca3e975b | |||
| 25091dbd2e | |||
| 1e3aabc143 | |||
| 25a714efa7 | |||
| 3eb026139f | |||
| 5ba6975573 | |||
| d4afce3699 | |||
| 2397c2c4b3 | |||
| a7084ec74b | |||
| 700566b1e2 | |||
| af5841a0d7 | |||
| 8941e131e8 | |||
| ce4b0cb9e4 | |||
| 6878a0c184 | |||
| 87f30e7c62 | |||
| e0c3d0699c | |||
| 1e962a0aef | |||
| 54468b60bc | |||
| 9f0fc57428 | |||
| ee805d5af7 | |||
| 188643b0f5 | |||
| fee47514a2 | |||
| 1281ca689b | |||
| dbc42f05db | |||
| 31ec8ab205 | |||
| a6629bdb29 | |||
| ecb7d0154e | |||
| d955fa239c | |||
| 036dbad816 |
20
.claude/rules/agents.md
Normal file
@@ -0,0 +1,20 @@
## Kaizen Agents

Specialized agent personas available on demand via the state-hub MCP.

**Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
**Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them

Common agents:

| Agent | Category | When to use |
|-------|----------|-------------|
| `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
| `code-refactoring` | quality | Code quality analysis and safe refactoring |
| `test-maintenance` | testing | Diagnose and fix failing tests |
| `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
| `keepaTodofile` | process | Maintain TODO.md during work |
| `project-management` | process | Track status, determine next steps |
| `datamodel-optimization` | quality | Optimize dataclasses and data structures |

All 17 agents: call `list_kaizen_agents()` for the full list.
8
.claude/rules/architecture.md
Normal file
@@ -0,0 +1,8 @@
## Architecture

<!-- TODO: Describe the key design decisions and component structure.
Key modules, data flows, external integrations, state machines, etc. -->

## Quick Reference

`~/state-hub/mcp_server/TOOLS.md` — MCP tool reference
50
.claude/rules/credential-routing.md
Normal file
@@ -0,0 +1,50 @@
# Credential and access routing

**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.

ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.

### Lookup (do this first)

```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```

Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).

| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=shard-wiki` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |

### Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |

### Anti-patterns (do not do these)

- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat

### Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.

**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
38
.claude/rules/first-session.md
Normal file
@@ -0,0 +1,38 @@
## First Session Protocol

Triggered when `get_domain_summary("consumer")` shows **no workstreams**.
The project is registered but work has not yet been structured.

**Step 1 — Read, don't write**
- `~/the-custodian/canon/projects/consumer/project_charter_v0.1.md` — purpose, scope
- `~/the-custodian/canon/projects/consumer/roadmap_v0.1.md` — planned phases
- Scan repo root: README, directory structure, existing code or docs

**Step 2 — Survey in-progress work**
Look for TODOs, open branches, half-finished files. Note done vs. started but incomplete.

**Step 3 — Propose workstreams to Bernd**
Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
roadmap phase. **Wait for approval before creating.**

**Step 4 — Create workplan file first, then DB record (ADR-001)**
```
workplans/SHARD-WP-NNNN-<slug>.md  ← write this first
```
Then register in the hub:
```
create_workstream(topic_id="4c2e5315-2cb9-447c-9d16-a39bdb0aabd0", title="...", owner="...", description="...")
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
```

**Step 5 — Record the setup**
```
add_progress_event(
    summary="First session: structured consumer into N workstreams, M tasks",
    event_type="milestone",
    topic_id="4c2e5315-2cb9-447c-9d16-a39bdb0aabd0",
    detail={"workstreams": [...], "tasks_created": M}
)
```

<!-- Delete or archive this file once past first session -->
8
.claude/rules/repo-boundary.md
Normal file
@@ -0,0 +1,8 @@
## Repo boundary

This repo owns **shard-wiki** only. It does not own:

<!-- TODO: List what belongs in adjacent repos, e.g.:
- SSH key management → railiance-infra/
- State hub code → state-hub/
-->
5
.claude/rules/repo-identity.md
Normal file
@@ -0,0 +1,5 @@
**Purpose:** Git-based Markdown wiki orchestrator and federation layer. Python (src/ layout, hatchling, pytest). Early-stage: scaffold + INTENT.md defined, domain model not yet implemented. See INTENT.md for authoritative scope.

**Domain:** consumer
**Repo slug:** shard-wiki
**Topic ID:** 4c2e5315-2cb9-447c-9d16-a39bdb0aabd0
85
.claude/rules/session-protocol.md
Normal file
@@ -0,0 +1,85 @@
## Session Protocol

Dev Hub (State Hub API): http://127.0.0.1:8000
MCP server name in `~/.claude.json`: `dev-hub`

**Step 1 — Orient**

Read the offline-safe brief first — it works without a live hub connection:
```bash
cat .custodian-brief.md
```
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
```
get_domain_summary("consumer")
```
If MCP tools are unavailable in the current agent session, use the REST API:
```bash
curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
```
If the hub is offline: `cd ~/state-hub && make api`

**Step 2 — Check inbox**
With MCP tools:
```
get_messages(to_agent="shard-wiki", unread_only=True)
```
Mark read with `mark_message_read(message_id)`. Reply or act on coordination
requests before proceeding.

Without MCP tools:
```bash
curl -s "http://127.0.0.1:8000/messages/?to_agent=shard-wiki&unread_only=true" \
  | python3 -m json.tool
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
```

**Step 3 — Scan workplans**
```bash
ls workplans/
```
For each file with `status: ready`, `active`, or `blocked`, note pending
`wait`/`todo`/`progress` tasks.
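
The Step 3 scan could be automated with a short script. What follows is a minimal sketch, not something shipped in this repo: `frontmatter_status` and `open_workplans` are hypothetical helper names, and the parser assumes every workplan opens with a `---`-delimited frontmatter block that contains a plain `status:` line.

```python
from pathlib import Path
from typing import List, Optional

# Workplan statuses that Step 3 asks the agent to look at.
OPEN_WORKPLAN_STATUSES = {"ready", "active", "blocked"}


def frontmatter_status(text: str) -> Optional[str]:
    """Return the `status:` value from a leading `---` frontmatter block, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; ignore anything in the body
        key, sep, value = line.partition(":")
        if sep and key.strip() == "status":
            return value.strip().strip("\"'")
    return None


def open_workplans(directory: str = "workplans") -> List[Path]:
    """List workplan files whose frontmatter status is ready, active, or blocked."""
    return sorted(
        path
        for path in Path(directory).glob("*.md")
        if frontmatter_status(path.read_text(encoding="utf-8")) in OPEN_WORKPLAN_STATUSES
    )
```

Only the top-level `workplans/*.md` files are globbed, so anything under `workplans/archived/` is skipped. Pending tasks inside each returned file still need to be read from its task blocks.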

**Step 4 — Present brief**

1. **Active workstreams** for `consumer` — title, task counts, blocking decisions
2. **Pending tasks** from `workplans/` + any `[repo:shard-wiki]` hub tasks
3. **Goal guidance** — if `goal_guidance` in summary:
   - `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
   - `alignment_warnings`: flag if active work is not aligned with current goal
4. **Suggested next action** — highest-priority open item
5. **SBOM status** — flag if `last_sbom_at` is unset for this repo

If no workstreams: follow First Session Protocol (`first-session.md`).

**During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`

> State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
> are First Session Protocol only. Work structure belongs in repo files (ADR-001).

**Session close:**
With MCP tools:
```
add_progress_event(summary="...", topic_id="4c2e5315-2cb9-447c-9d16-a39bdb0aabd0", workstream_id="<uuid>")
```
Without MCP tools:
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{"topic_id":"4c2e5315-2cb9-447c-9d16-a39bdb0aabd0","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
```
If workplan files were modified, ensure the local copy is up to date first:
```bash
git -C <repo_path> pull --ff-only
cd ~/state-hub && make fix-consistency REPO=shard-wiki
```
For repos where implementation runs on a remote machine (e.g. CoulombCore),
use the combined target which pulls before fixing:
```bash
cd ~/state-hub && make fix-consistency-remote REPO=shard-wiki
```
**C-15** (DB task ahead of file) is normal in multi-machine workflows — writeback
will sync the file to match DB. **C-16** (repo behind remote) blocks all writes
until you pull — intentional to prevent clobbering remote progress.
19
.claude/rules/stack-and-commands.md
Normal file
@@ -0,0 +1,19 @@
## Stack

<!-- TODO: Fill in language, frameworks, and key dependencies -->
- **Language:**
- **Key deps:**

## Dev Commands

```bash
# TODO: Fill in the standard commands for this repo

# Install dependencies

# Run tests

# Lint / type check

# Build / package (if applicable)
```
40
.claude/rules/workplan-convention.md
Normal file
@@ -0,0 +1,40 @@
## Workplan Convention (ADR-001)

File location: `workplans/SHARD-WP-NNNN-<slug>.md`
ID prefix: `SHARD-WP-`

Work items originate as files in this repo **before** being registered in the hub.

Canonical workplan/workstream frontmatter statuses are:
`proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
Use `proposed` for a newly drafted plan, `ready` after review against current
repo state, and `finished` when implementation is complete. `stalled` and
`needs_review` are derived health labels, not stored statuses.

Closed workplans may be moved to `workplans/archived/` with a completion-date
prefix: `YYMMDD-SHARD-WP-NNNN-<slug>.md`. The frontmatter id remains
unchanged; the prefix is only for quick visual reference.
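
The archive naming rule can be expressed as a small path helper. This is an illustrative sketch under stated assumptions: `archived_path` is a hypothetical name, the example slug `union-views` is made up, and the function only computes the target path without moving any file.

```python
from datetime import date
from pathlib import Path


def archived_path(workplan: Path, completed: date) -> Path:
    """Compute workplans/archived/YYMMDD-<original name> for a closed workplan."""
    prefix = completed.strftime("%y%m%d")  # two-digit year, month, day
    return workplan.parent / "archived" / f"{prefix}-{workplan.name}"


# Hypothetical workplan file name, used only for illustration.
target = archived_path(Path("workplans/SHARD-WP-0007-union-views.md"), date(2026, 6, 22))
print(target.as_posix())
```

Because the original file name is reused unchanged after the date prefix, the `SHARD-WP-NNNN` id stays greppable in the archive.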

Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
`workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
`ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
directly. Promote anything requiring analysis, design, approval, dependencies, or
multiple planned phases into a normal workplan.
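
The three ad hoc naming patterns all derive from one date, which a short helper makes explicit. A sketch only: `adhoc_names` is a hypothetical function, and it assumes task numbers are zero-padded to two digits as in `T01`, `T02`.

```python
from datetime import date
from typing import Dict, List, Union


def adhoc_names(day: date, task_count: int) -> Dict[str, Union[str, List[str]]]:
    """Derive the ad hoc workplan file, workstream slug, and task ids for one day."""
    stamp = day.isoformat()  # YYYY-MM-DD
    return {
        "file": f"workplans/ADHOC-{stamp}.md",
        "workstream_slug": f"adhoc-{stamp}",
        "task_ids": [f"ADHOC-{stamp}-T{n:02d}" for n in range(1, task_count + 1)],
    }


names = adhoc_names(date(2026, 6, 15), task_count=2)
print(names["file"], names["workstream_slug"], names["task_ids"])
```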

Ecosystem todos from other agents arrive as `[repo:shard-wiki]` hub tasks —
visible at session start. Pick one up by creating the workplan file, then registering
the workstream.

Task blocks use this shape:

```task
id: SHARD-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>"  # written by fix-consistency — do not edit
```

Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
blocked work and `cancel` for stopped work.
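
A reader for this task-block shape might look like the following. It is a minimal sketch, not the parser that `fix-consistency` uses (that implementation is not shown here): `parse_task_blocks` is a hypothetical name, and the code assumes each field sits on its own `key: value` line with optional `#` comments.

```python
import re
from typing import Dict, List

TASK_STATUSES = {"wait", "todo", "progress", "done", "cancel"}
TASK_PRIORITIES = {"high", "medium", "low"}

# `{3} matches a triple-backtick fence; the opening fence must carry the `task` info string.
TASK_BLOCK = re.compile(r"^`{3}task\n(.*?)^`{3}", re.MULTILINE | re.DOTALL)


def parse_task_blocks(markdown: str) -> List[Dict[str, str]]:
    """Extract every task block as a dict and validate status and priority."""
    tasks = []
    for match in TASK_BLOCK.finditer(markdown):
        fields: Dict[str, str] = {}
        for line in match.group(1).splitlines():
            line = line.split("#", 1)[0]  # drop trailing comments
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip().strip('"')
        if fields.get("status") not in TASK_STATUSES:
            raise ValueError(f"{fields.get('id')}: unknown status {fields.get('status')!r}")
        if fields.get("priority") not in TASK_PRIORITIES:
            raise ValueError(f"{fields.get('id')}: unknown priority {fields.get('priority')!r}")
        tasks.append(fields)
    return tasks
```

Rejecting unknown values early means an unfilled template line such as `status: wait | todo | progress | done | cancel` fails loudly instead of being synced as a real status.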

<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
@@ -2,60 +2,46 @@
# Custodian Brief — shard-wiki

**Domain:** whynot
**Last synced:** 2026-06-14 16:28 UTC
**Last synced:** 2026-06-15 22:57 UTC
**State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*

## Active Workstreams

### computational / interactive-knowledge systems research
Progress: 0/8 done | workstream_id: `5fc3911a-1c68-4826-bbd2-b892dec8f981`
### second adapter — git-IS-store shard (contract validation on a new substrate)
Progress: 0/3 done | workstream_id: `9e24eeb0-c0f0-41e6-a1ca-88d71e4139ea`

**Open tasks:**
- · Literate Programming — Knuth's WEB / weave / tangle `c252fccc`
- · Mathematica Notebooks `6e46ec45`
- · Jupyter Notebooks `3d35d58f`
- · Processing.org and Processing.js `0f54cb0e`
- · Strudel.cc — live-coding REPL `2b099639`
- · Squeak Smalltalk `50723598`
- · Glamorous Toolkit — moldable development `9e11efac`
- … and 1 more open tasks
- · GitShardAdapter — read over a git working tree/repo `8a1c7c80`
- · Write = commit; current_rev = sha (drift) `b47dfb86`
- · History adopt + integration with union/overlay `4c895f42`

### wiki-engine deep-dive batch (new-insight + git-forge + classic engines)
Progress: 0/9 done | workstream_id: `56af8185-420f-4a56-9152-5c64fda7a3f3`
### incremental union maintenance + equivalence index + I-2 verification
Progress: 0/4 done | workstream_id: `78d48bcf-6482-4266-bc81-084b7ec1cd80`

**Open tasks:**
- · Federated Wiki — federation model (fork / journal / happenings) `347fa26d`
- · Wikibase / Wikidata — RDF entity-statement knowledge graph `23272f05`
- · TiddlyWiki — the self-contained single-file wiki `a0ab1aba`
- · ikiwiki — compile-to-static, git-backed, distributed peer wikis `5748d246`
- · Gitea / GitLab / GitHub wikis — git-forge-hosted Markdown wikis `9719ce23`
- · Salesforce Quip — collaborative live docs + embedded spreadsheets `3e55128a`
- · Oddmuse — minimalist single-script Perl wiki `581e0a96`
- … and 2 more open tasks
- · Equivalence index: blocking + verify `842f480b`
- · Incremental maintenance (delta, not additive) `2da4e0b8`
- · I-2 verification: digest + consistency-checker `b602ce31`
- · Wire incremental tier behind resolution + views `2f3d083c`

### federation architecture design
Progress: 0/16 done | workstream_id: `2af4c46d-cbfd-40ea-a94b-d9e60b0f9945`
### derived views — wikilinks, BackLinks, RecentChanges, AllPages/SiteMap
Progress: 0/5 done | workstream_id: `2fe15330-ddf6-4b0f-8e55-ada341375d35`

**Open tasks:**
- · Architecture positioning and boundaries `ea8fdb22`
- · Remix primitives: fork, overlay, import, reference `fb7d4bce`
- · Equivalent page identity and multi-version presentation `8f2a333d`
- · History, attribution, and coordination journal `5f39f48d`
- · Union composition layer `3ff71e11`
- · Change notification and subscription transports `9596e5e8`
- · Information space lifecycle `38134064`
- … and 9 more open tasks
- · Wikilink + red-link model `792660c3`
- · BackLinks (core) `431a54c3`
- · RecentChanges (core) `270c1c31`
- · AllPages / SiteMap (core) `898ba43e`
- · Wiring + integration `7157544b`

### shard-wiki requirements from yawex prior art
Progress: 0/6 done | workstream_id: `0ed023a2-760b-4990-b931-8ee1f41ea08f`
### git-backed DecisionLog + per-space append authority
Progress: 0/4 done | workstream_id: `4fb5b29b-955c-4f37-85cf-58b4643ab1ca`

**Open tasks:**
- · Design federation page-resolution model (yawex state space as inspiration) `ebc036e4`
- · Define namespace/path model and page+shard roles `431b4d28`
- · Specify union-level derived views (BackLinks, RecentChanges, AllPages, SiteMap, Search) `564545ec`
- · Provenance & freshness model for pages/revisions/projections `738326f5`
- · Overlay / lightweight-patch model (from yawex append/comment) `a268de6a`
- · Markdown link semantics: wikilink + red-link extension `a7499f3e`
- · Git event-store backend (append = commit/object) `a8fcbb3e`
- · Per-space append authority (lease) `62abd162`
- · Fold over the git log + read-your-writes across processes `8cc3691e`
- · Migration + wiring `281e1db4`

---
## MCP Orientation (when available)
17
.repo-classification.yaml
Normal file
@@ -0,0 +1,17 @@
repo_classification:
  standard: Repo Classification Standard
  version: '1.0'
  classified_at: '2026-06-22'
  classified_by: agent
  category: project
  domain: consumer
  secondary_domains: []
  capability_tags:
  - knowledge
  - documentation
  business_stake:
  - product
  - experience
  business_mechanics:
  - coordination
  - operation
243
AGENTS.md
@@ -1,62 +1,219 @@
# AGENTS.md
# shard-wiki — Agent Instructions

Guidance for agents working in `shard-wiki`.
## Repo Identity

## Read First
**Purpose:** Git-based Markdown wiki orchestrator and federation layer. Python (src/ layout, hatchling, pytest). Early-stage: scaffold + INTENT.md defined, domain model not yet implemented. See INTENT.md for authoritative scope.

1. `INTENT.md` — aspiration and boundaries (stable; architectural changes are rare).
2. `SCOPE.md` — what we are achieving now and current maturity.
3. `.custodian-brief.md` — State Hub snapshot (generated; do not edit manually).
**Domain:** consumer
**Repo slug:** shard-wiki
**Topic ID:** `4c2e5315-2cb9-447c-9d16-a39bdb0aabd0`
**Workplan prefix:** `SHARD-WP-`

## Documentation Layout
---

This repo follows the CoulombSocial / HelixForge / MarkiTect documentation
layout (recommendation, not strict law). Efficient retrieval by purpose:
## State Hub Integration

| Path | Purpose |
|------|---------|
| `INTENT.md` | Aspiration and boundaries |
| `SCOPE.md` | Top-level view of current achievement; closes gap to INTENT |
| `research/` | Exploration results (`yymmdd-` prefix on files or subdirs) |
| `demand/` | Inbound requests not yet reviewed into spec or workplans |
| `spec/` | Implementation guardrails (PRD, TSD, use cases, architecture) |
| `workplans/` | State Hub–registered implementation tasks |
| `docs/` | Stakeholder documentation (users, developers, humans, agents) |
| `wiki/` | Perspective-free interconnected knowledge (wiki UI when connected) |
| `issues/` | Mirror of relevant open tickets when ticket systems are in use |
| `history/` | Archived material (`yymmdd-` prefix); out of scope for daily work |
The Custodian State Hub tracks work across all domains. Interact via HTTP REST —
there is no MCP server for Codex agents.

**Mode of operation:** close SCOPE → INTENT while learning; refine both as needed.
| Context | URL |
|---------|-----|
| Local workstation | `http://127.0.0.1:8000` |
| Remote via tunnel | `http://127.0.0.1:18000` |

## Domain Vocabulary

Honor terms from `INTENT.md`: shard, root entity, adapter contract, projection,
overlay, coordination journal, shard modes. Do not invent parallel vocabulary.

## Build And Test
### Orient at session start

```bash
pip install -e ".[dev]"
pytest
ruff check
ruff format
# Offline brief — works without hub connection
cat .custodian-brief.md

# Active workstreams for this domain
curl -s "http://127.0.0.1:8000/workstreams/?topic_id=4c2e5315-2cb9-447c-9d16-a39bdb0aabd0&status=active" \
  | python3 -m json.tool

# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=shard-wiki&unread_only=true" \
  | python3 -m json.tool
```

## State Hub
Mark a message read:
```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
```

Workplans register with State Hub. After workplan changes:
### Log progress (required at session close)

```bash
cd ~/state-hub && make fix-consistency REPO=shard-wiki
curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "what was done",
    "event_type": "note",
    "author": "codex",
    "workstream_id": "<uuid>",
    "task_id": "<uuid>"
  }'
```

Finished or canceled workplans move to `history/` with a `yymmdd-` archive prefix.
Omit `workstream_id` / `task_id` when not applicable.
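
The "omit when not applicable" rule is easy to get wrong when the payload is assembled by hand. The sketch below builds the JSON body for `POST /progress/` and leaves out unset optional fields; `progress_payload` is a hypothetical helper, the field names are taken from the curl example, and nothing here sends a request.

```python
import json
from typing import Optional


def progress_payload(
    summary: str,
    author: str = "codex",
    event_type: str = "note",
    workstream_id: Optional[str] = None,
    task_id: Optional[str] = None,
) -> str:
    """Serialize a progress event, dropping optional ids that were not provided."""
    payload = {
        "summary": summary,
        "event_type": event_type,
        "author": author,
        "workstream_id": workstream_id,
        "task_id": task_id,
    }
    # Omit keys instead of sending null, matching the guidance above.
    return json.dumps({key: value for key, value in payload.items() if value is not None})


print(progress_payload("what was done"))
```

The returned string can be passed to `curl -d` or to any HTTP client as the request body.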
|
||||
|
||||
## Where To Put New Material
|
||||
### Update task status
|
||||
|
||||
- Exploratory analysis → `research/yymmdd-<topic>/`
|
||||
- Raw feature ask or external requirement → `demand/`
|
||||
- Reviewed design ready to guide code → `spec/`
|
||||
- Implementation tasks → `workplans/`
|
||||
- User/dev/agent how-to → `docs/`
|
||||
- Collaborative unstructured notes → `wiki/`
|
||||
```bash
|
||||
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"status": "progress"}'
|
||||
# values: wait | todo | progress | done | cancel
|
||||
```
|
||||
|
||||
### Flag a task for human review
|
||||
|
||||
```bash
|
||||
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{"needs_human": true, "intervention_note": "reason"}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Session Protocol
|
||||
|
||||
**Start:**
|
||||
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
|
||||
2. Check inbox: `GET /messages/?to_agent=shard-wiki&unread_only=true`; mark read
|
||||
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
|
||||
4. Check human-needed tasks: `GET /tasks/?needs_human=true`
|
||||
|
||||
**During work:**
|
||||
- Update task statuses in workplan files as tasks progress
|
||||
- Record significant decisions via `POST /decisions/`
|
||||
|
||||
**Close:**
|
||||
1. Update workplan file task statuses to reflect progress
|
||||
2. Log: `POST /progress/` with a summary of what changed
|
||||
3. Note for the custodian operator: after workplan file changes, run from
|
||||
`~/state-hub`:
|
||||
```bash
|
||||
make fix-consistency REPO=shard-wiki
|
||||
```
|
||||
This syncs task status from files into the hub DB.
|
||||
|
||||
---
|
||||
|
||||
## Credential and access routing
|
||||
|
||||
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
|
||||
for inference. Run this check **before** requesting secrets, API keys, SSH access,
|
||||
login tokens, or database passwords — in any repo, not only `ops-warden`.
|
||||
|
||||
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
|
||||
other credential need belongs to another subsystem. **Do not** message
|
||||
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
|
||||
|
||||
### Lookup (do this first)
|
||||
|
||||
```bash
|
||||
warden route find "<describe your need>" --json
|
||||
warden route show <catalog-id> --json
|
||||
```
|
||||
|
||||
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
|
||||
|
||||
| Agent runtime | How to orient |
|
||||
| --- | --- |
|
||||
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=shard-wiki` is for coordination, not secret vending |
|
||||
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
|
||||
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
|
||||
|
||||
### Quick routing table
|
||||
|
||||
| I need… | Owner | ops-warden executes? |
|
||||
| --- | --- | --- |
|
||||
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
|
||||
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
|
||||
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
|
||||
| Authorization decision | flex-auth | No — route only |
|
||||
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
|
||||
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
|
||||
|
||||
### Anti-patterns (do not do these)
|
||||
|
||||
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
|
||||
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
|
||||
- Pasting secrets into Git, State Hub, workplans, logs, or chat
|
||||
|
||||
### Other capabilities (reuse-surface)
|
||||
|
||||
Non-credential capabilities are usually discovered through **reuse-surface** federation
|
||||
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
|
||||
every repo's agent instructions because it is high-frequency, high-risk, and easy to
|
||||
get wrong.
|
||||
|
||||
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
|
||||
|
||||
<!-- REPO-AGENTS-EXTENSIONS -->
|
||||
<!-- Append repo-specific agent instructions below this marker.
|
||||
The state-hub template sync preserves content after this line. -->
|
||||
|
||||
---
|
||||
|
||||
## Workplan Convention (ADR-001)
|
||||
|
||||
Work items originate as files in this repo — not in the hub. The hub is a
|
||||
read/cache/index layer that rebuilds from files.
|
||||
|
||||
**File location:** `workplans/SHARD-WP-NNNN-<slug>.md`
|
||||
|
||||
**Archived location:** finished workplans may move to
|
||||
`workplans/archived/YYMMDD-SHARD-WP-NNNN-<slug>.md`. The `YYMMDD` prefix is
|
||||
the completion/archive date; the frontmatter `id` does not change.
|
||||
|
||||
**Ad Hoc Tasks:** small opportunistic fixes discovered during a session use
|
||||
`workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use
|
||||
this only for low-risk work completed directly; create a normal workplan for
|
||||
anything needing analysis, design, approval, dependencies, or multiple phases.
|
||||
|
||||
**Frontmatter:**
|
||||
|
||||
```yaml
|
||||
---
|
||||
id: SHARD-WP-NNNN
|
||||
type: workplan
|
||||
title: "..."
|
||||
domain: consumer
|
||||
repo: shard-wiki
|
||||
status: proposed | ready | active | blocked | backlog | finished | archived
|
||||
owner: codex
|
||||
topic_slug: ...
|
||||
created: "YYYY-MM-DD"
|
||||
updated: "YYYY-MM-DD"
|
||||
state_hub_workstream_id: "<uuid>" # written by fix-consistency — do not edit
|
||||
---
|
||||
```
|
||||
|
||||
Use `proposed` for a new draft, `ready` after review against current repo
|
||||
state, and `finished` after implementation. `stalled` and `needs_review` are
|
||||
derived health labels, not frontmatter statuses.
|
||||
|
||||
**Task block format** (one per `##` section):
|
||||
|
||||
```
## Task Title

` ` `task
id: SHARD-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
` ` `

Task description text.
```
|
||||
|
||||
Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.
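The task vocabulary above can be checked mechanically before a consistency run is requested. The following is a minimal sketch, not part of the repo or the hub tooling: `parse_task_block`, `validate_task`, and the id pattern are hypothetical names and assumptions introduced only for illustration.

```python
import re

# Allowed values, copied from the convention above.
TASK_STATUSES = {"wait", "todo", "progress", "done", "cancel"}
PRIORITIES = {"high", "medium", "low"}
# Assumed id shapes: SHARD-WP-NNNN-Tnn and ADHOC-YYYY-MM-DD-Tnn.
TASK_ID = re.compile(r"^(SHARD-WP-\d{4}-T\d+|ADHOC-\d{4}-\d{2}-\d{2}-T\d+)$")


def parse_task_block(text: str) -> dict[str, str]:
    """Parse the `key: value` lines of one task block (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip('"')
    return fields


def validate_task(fields: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the block is valid."""
    problems = []
    if not TASK_ID.match(fields.get("id", "")):
        problems.append(f"bad id: {fields.get('id')!r}")
    if fields.get("status") not in TASK_STATUSES:
        problems.append(f"bad status: {fields.get('status')!r}")
    if fields.get("priority") not in PRIORITIES:
        problems.append(f"bad priority: {fields.get('priority')!r}")
    return problems
```

A derived health label such as `stalled` would be reported as a bad status, which matches the rule that it is not a frontmatter or task status.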
|
||||
|
||||
To create a new workplan:
|
||||
1. Write the file following the format above
|
||||
2. Notify the custodian operator to run `make fix-consistency REPO=shard-wiki`
|
||||
(or send a message to the hub agent via `POST /messages/`)
|
||||
|
||||
63
CLAUDE.md
@@ -1,53 +1,12 @@
|
||||
# CLAUDE.md
|
||||
# shard-wiki — Claude Code Instructions
|
||||
|
||||
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
|
||||
|
||||
## Repository status
|
||||
|
||||
This is an **early-stage Python repository**. The package scaffold (`src/shard_wiki/`, `tests/`, `pyproject.toml`) exists with only smoke tests — the domain model is not yet implemented. Read `INTENT.md` (aspiration), `SCOPE.md` (current achievement), and `AGENTS.md` (layout and conventions) before designing anything. Close the gap from SCOPE to INTENT via `research/`, `spec/`, and `workplans/`.
|
||||
|
||||
## What this project is
|
||||
|
||||
`shard-wiki` is a **Git-based Markdown wiki orchestrator and federation layer**, not a wiki engine. It lets multiple heterogeneous wiki-shaped page stores (**shards**) attach to a shared root entity and be presented as a **union of pages**, while preserving each shard's separate storage, provenance, capabilities, and history.
|
||||
|
||||
The core job is orchestration across backends — Git repos, repo subdirectories (`wiki/`), Gitea wikis, local folders, Obsidian vaults, WebDAV/Nextcloud directories, Coulomb spaces — never replacing or homogenizing them.
|
||||
|
||||
## Core domain model (the concepts code must honor)
|
||||
|
||||
These abstractions come from `INTENT.md` and define the architecture. New code should map onto them rather than inventing parallel vocabulary:
|
||||
|
||||
- **Shard** — an independently meaningful page store attached to a root entity. Shards have *sovereignty*: their own backend, capabilities, limits, history, and identity model. Not all shards are Git-native.
|
||||
- **Root entity / information space** — the joined space that shards attach to. Each information space should have a **Git-addressable coordination layer** (history, patches, review, backup, reconciliation) even when individual shards are not Git-native.
|
||||
- **Shard adapter contract** — the versioned interface a backend implements to participate. Adapters are **capability-aware**: the core must model explicitly which operations a shard supports (read, write, diff, merge, lock, version, publish, accept patches) rather than assuming uniformity.
|
||||
- **Wiki page model** — a stable, versioned, Markdown-first but backend-neutral representation of pages, paths, links, metadata, revisions.
|
||||
- **Projection** — a lazy, cache-like local view of remote/external shard content. Prefer lazy projection over eager copying.
|
||||
- **Overlay** — a non-destructive local edit against a remote, read-only, or capability-limited shard, representable as drafts/patches/commits/merge requests *before* destructive application ("overlay before mutation").
|
||||
- **Coordination journal** — the Git-backed record of change flows for an information space.
|
||||
- **Shard modes** — read-only, write-through, mirrored, projected, cached, canonical.
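The capability-aware adapter idea in the list above could be modeled roughly as follows. This is a sketch under stated assumptions: `Capability`, `ShardProfile`, and `plan_edit` are hypothetical names chosen for illustration and are not the repository's actual types; only the operation names come from the list.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Capability(Flag):
    """Operations a shard may support (names follow the list above)."""
    READ = auto()
    WRITE = auto()
    DIFF = auto()
    MERGE = auto()
    LOCK = auto()
    VERSION = auto()
    PUBLISH = auto()
    ACCEPT_PATCHES = auto()


@dataclass(frozen=True)
class ShardProfile:
    """Hypothetical capability declaration for one attached shard."""
    name: str
    capabilities: Capability

    def supports(self, needed: Capability) -> bool:
        # True only if every requested capability bit is declared.
        return (self.capabilities & needed) == needed


def plan_edit(shard: ShardProfile) -> str:
    """Pick write-through or an overlay based on declared capabilities."""
    if shard.supports(Capability.WRITE):
        return "write-through"
    if shard.supports(Capability.READ):
        return "overlay"  # non-destructive local edit against a limited shard
    return "unavailable"
```

The point of the sketch is that the core branches on declared data rather than on the backend type, so a read-only WebDAV shard and a writable Git shard go through the same code path.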
|
||||
|
||||
## Design constraints to enforce in code
|
||||
|
||||
These are hard boundaries from `INTENT.md`; treat violations as design bugs:
|
||||
|
||||
- **Mechanism over policy.** Provide primitives for federation, sync, overlays, patching, conflict detection, projection, reconciliation. Do *not* hard-code one editorial/sync/conflict/canonical-source policy — keep those configurable.
|
||||
- **Union without erasure.** Always preserve provenance: which shard a page came from, its freshness, whether it is cached, whether it has overlays, whether it diverges from an equivalent page elsewhere. Never hide authorship, conflicts, freshness, or backend limitations.
|
||||
- **No silent remote mutation.** Do not mutate remote systems without explicit adapter support and user intent.
|
||||
- **Graceful degradation.** Limited backends must still be usable as read-only/cache/projection/backup/patch targets.
|
||||
- **Not a file-sync daemon.** Synchronization is wiki-page-semantic, not generic file mirroring.
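"Union without erasure" can be illustrated with a toy union that groups pages by path without collapsing them. This is only a sketch: `PageRef`, `union`, and `diverging` are hypothetical names, and the real resolution logic in the package is assumed to be considerably richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageRef:
    """Hypothetical page reference that carries its provenance with it."""
    shard: str
    path: str
    cached: bool = False
    has_overlay: bool = False


def union(pages: list[PageRef]) -> dict[str, list[PageRef]]:
    """Group pages by path, keeping every contributing shard visible."""
    by_path: dict[str, list[PageRef]] = {}
    for page in pages:
        by_path.setdefault(page.path, []).append(page)
    return by_path


def diverging(view: dict[str, list[PageRef]]) -> list[str]:
    """Paths served by more than one shard: candidates for divergence checks."""
    return sorted(path for path, refs in view.items() if len(refs) > 1)
```

Because the union maps a path to a list rather than to a single winner, choosing which version to present stays a policy decision made by the caller.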
|
||||
|
||||
`INTENT.md` has a "Stability Note": changes that redefine what a shard is, Git's role, how root entities are modeled, or whether this is an orchestrator vs. an engine are **architectural changes** and should be rare and deliberate.
|
||||
|
||||
## Build, test, run
|
||||
|
||||
Python with a `src/` layout, built via hatchling, tested with pytest. Tests run against the source tree directly (`pythonpath = ["src"]` in `pyproject.toml`), so no install/editable step is required to run them.
|
||||
|
||||
```bash
pip install -e ".[dev]"                                # one-time: install dev tooling (pytest, pytest-cov, ruff)
pytest                                                 # run the full test suite
pytest tests/test_package.py::test_version_is_exposed  # run a single test
pytest --cov                                           # run with coverage
ruff check                                             # lint
ruff format                                            # format
```
|
||||
|
||||
Note: the system `pytest` is 7.4.x; `minversion` in `pyproject.toml` is pinned to `7.0` to match. Bump it if a newer pytest is installed into the dev environment.
|
||||
@SCOPE.md
|
||||
@.claude/rules/repo-identity.md
|
||||
@.claude/rules/session-protocol.md
|
||||
@.claude/rules/first-session.md
|
||||
@.claude/rules/workplan-convention.md
|
||||
@.claude/rules/stack-and-commands.md
|
||||
@.claude/rules/architecture.md
|
||||
@.claude/rules/repo-boundary.md
|
||||
@.claude/rules/credential-routing.md
|
||||
@.claude/rules/agents.md
|
||||
|
||||
10
INTENT.md
@@ -16,6 +16,8 @@ The goal is to allow independently stored and differently implemented wikis, pag
|
||||
|
||||
The repository provides a **shard orchestration layer** for interconnected Markdown and markup-based wiki content.
|
||||
|
||||
Equivalently, shard-wiki can be used as a **headless, API-first wiki engine** — optimized for **integrating heterogeneous data sources** and for **efficient access by agents and automation** — that ships its own native engine as one (canonical-mode) shard among many. There is no bundled UI: presentation and rendering are consumer concerns.
|
||||
|
||||
It allows wiki-like systems to:
|
||||
|
||||
* Attach heterogeneous page stores as shards of a shared information space
|
||||
@@ -30,6 +32,7 @@ It allows wiki-like systems to:
|
||||
* Run fully standalone with open read/write access and complete change history, then progressively layer multi-tenant enterprise access control through external identity integration
|
||||
* Allow existing wiki engines to become federation-capable through a shared API
|
||||
* Allow non-federation-aware systems to participate through adapters and projections
|
||||
* Serve as a **headless, API-first wiki engine** (a small typed-extension core) that integrates heterogeneous data sources and is consumed efficiently by agents and automation
|
||||
|
||||
It transforms disconnected wiki engines, Git repositories, local folders, WebDAV directories, application-specific content stores, and desktop editing workflows into a **composable federated wiki space**.
|
||||
|
||||
@@ -85,7 +88,7 @@ A mature `shard-wiki` should allow each participating shard to see the others as
|
||||
|
||||
This repository is **not** intended to:
|
||||
|
||||
* Replace all wiki engines with a single canonical wiki implementation
|
||||
* Replace all wiki engines with a single canonical wiki implementation *(shard-wiki MAY still provide its own native, headless, API-first engine as one optional shard backend — see Design Principles — but never as a mandated or universal replacement)*
|
||||
* Force every shard to use the same backend, database, directory layout, or storage format
|
||||
* Require every participating system to become federation-aware
|
||||
* Require every participating shard to be Git-native
|
||||
@@ -148,6 +151,9 @@ Policy decisions such as conflict preference, canonical source selection, public
|
||||
* **Composable integration**
|
||||
Wiki engines should be able to use the `shard-wiki` API to become federation-enabled without reimplementing federation internally.
|
||||
|
||||
* **Native reference engine (additive, headless & API-first)**
|
||||
shard-wiki MAY provide its own native wiki-engine as a **canonical-mode shard backend** — a **small core** with a **typed-extension framework**, activated **per shard** (only what you need). It is **headless and API-first** (no bundled UI; presentation/rendering are consumer concerns) and tuned for **integrating heterogeneous data sources** and **efficient agent/automation access**. It is *one shard type among many*, implemented against shard-wiki's own adapter contract; it does **not** replace other engines, mandate a single implementation, or change shard-wiki's role as an orchestrator. Shard sovereignty and union-without-erasure are preserved.
|
||||
|
||||
* **Open by default, progressively governed**
|
||||
A standalone `shard-wiki` must be runnable with zero external dependencies in a classic Ward Cunningham / c2-style open read/write-for-all mode. Access control is an *additive capability*, not a precondition: the same core progresses — without re-architecture — to authenticated single-user, to group/role-based, to multi-tenant enterprise access control, mirroring the NetKingdom capability ladder (lightweight → expanded).
|
||||
|
||||
@@ -201,3 +207,5 @@ Such changes should be rare, because they affect all downstream systems relying
|
||||
|
||||
In particular, changes that redefine what counts as a shard, what role Git plays, how root entities are modeled, or whether `shard-wiki` is an orchestrator rather than a wiki engine should be treated as architectural changes.
|
||||
|
||||
**Amendment — 2026-06-15 (SHARD-WP-0013 T4, decision `84ffdb48`):** admits an **additive** native reference wiki-engine — **headless, API-first**, a small typed-extension core — as a **canonical-mode shard backend** optimized for data-source integration and agent access. Deliberate, narrow scope change; shard-wiki remains an orchestrator and neither mandates nor replaces other engines. (Mirrors the earlier auth-in-core amendment precedent.)
|
||||
|
||||
|
||||
14
SCOPE.md
@@ -17,12 +17,12 @@ Learnings update both SCOPE and INTENT where necessary.
|
||||
|
||||
| Layer | State |
|
||||
|-------|-------|
|
||||
| Code | Python package scaffold (`src/shard_wiki/`, smoke tests only) |
|
||||
| Code | Foundation slice implemented (SHARD-WP-0007): `provenance` + `policy` leaves, `model` (Identity/Placement/Span/Page/CapabilityProfile), `adapters` (contract + FolderAdapter + conformance suite), `coordination` (event-sourced DecisionLog), `union` (resolution + chorus, overlay-aware), `InformationSpace` orchestrator. Write path added (SHARD-WP-0008): writable adapter, overlay engine (draft→patch→apply-under-drift), edit() unifies write-through + overlay-before-mutation. Native engine implemented (SHARD-WP-0014): `engine` (kernel + typed-extension runtime + per-shard activation [ADR-0001] + capability-profile-from-extensions + EngineShardAdapter + the `ext.struct` built-in) — an engine shard attaches to an InformationSpace as a canonical-mode shard. Git-backed coordination log (SHARD-WP-0009): `DecisionLog` storage factored behind an `EventStore`; `GitEventStore` makes the log git-addressable (each space a ref, append = immutable CAS-guarded commit), a per-space `AppendAuthority` (lease) gives a single-writer total order with re-grantable HA hand-off, cross-process read-your-writes verified, and a verbatim one-time importer (`migrate_space`/JSONL) replays in-memory logs into git; `InformationSpace.git_backed(...)` wires it. Derived views (SHARD-WP-0010): `views` (wikilink + red-link model, BackLinks, RecentChanges, AllPages/SiteMap) — recomputable, provenance-carrying, presentation-free, exposed via `InformationSpace.backlinks/recent_changes/all_pages/site_map`. Incremental-first derived tier (SHARD-WP-0011): `incremental` (indexed equivalence via MinHash/LSH blocking + verify, change-driven delta maintenance with retraction/propagation, Merkle-style digest + self-healing I-2 consistency-checker, `UnionIndex` routed behind `InformationSpace.all_pages` with rebuild as explicit fallback). 
Second adapter (SHARD-WP-0012): `GitShardAdapter` — git-IS-store substrate (read=tracked *.md, write=commit, current_rev=per-path sha for drift, adopted git-native history), passes conformance, works across folder+git shards in union/overlay/edit with no core change (capability-as-data proven on a second substrate). 196 tests green, ~97% coverage |
|
||||
| Intent | `INTENT.md` established; authorization-in-core amendments drafted |
|
||||
| Research | yawex prior art; c2 origins; federation concepts; wikiengines overview (`research/260608-*/`); XWiki/TWiki/Foswiki deep dives (`research/260613-*/`); Xanadu + ZigZag + Roam + Obsidian + Notion + Joplin + Logseq + local-first workspaces (Anytype/AFFiNE/AppFlowy) + Trilium + Wiki.js deep dives & shard-spectrum synthesis (`research/260614-*/`) |
|
||||
| Research | yawex prior art; c2 origins; federation concepts; wikiengines overview (`research/260608-*/`); XWiki/TWiki/Foswiki deep dives (`research/260613-*/`); Xanadu + ZigZag + Roam + Obsidian + Notion + Joplin + Logseq + local-first workspaces (Anytype/AFFiNE/AppFlowy) + Trilium + Wiki.js + Federated Wiki + Wikibase + git-forge wikis + TiddlyWiki + ikiwiki + Quip + MojoMojo + Oddmuse + UseModWiki deep dives & shard-spectrum synthesis (`research/260614-*/`) |
|
||||
| Demand | NetKingdom integration asks captured, not yet negotiated |
|
||||
| Spec | Architecture blueprint drafted; UseCaseCatalog 69 UCs from research; PRD/TSD scaffolds |
|
||||
| Work | `SHARD-WP-0001` active (6 tasks); `SHARD-WP-0002` active (16 tasks: T1–T10 federation + T11–T16 adapter contract); `SHARD-WP-0003` active (9 engine dives); `SHARD-WP-0004` active (8 computational-knowledge dives) |
|
||||
| Spec | CoreArchitectureBlueprint (whole-system, hardened via SHARD-WP-0005/0006) + FederationArchitecture + FederationRequirements + TSD §A adapter contract + ArchitectureBlueprint (auth/history) + WikiEngineCoreArchitecture (headless API-first engine, SHARD-WP-0013) drafted; UseCaseCatalog 84 UCs (+ engine capability-structure layer); PRD scaffold |
|
||||
| Work | `SHARD-WP-0001` **done** (6 ADRs: yawex-derived federation requirements → `spec/FederationRequirements.md`); `SHARD-WP-0002` **done** (18 tasks → `FederationArchitecture.md` [T1–T10, T17] + `TechnicalSpecificationDocument.md` §A adapter contract [T11–T16, T18]); `SHARD-WP-0003` **done** (9 engine dives complete); `SHARD-WP-0004` **done** (all 8 computational-knowledge dives T1–T8 complete + "computational page model" synthesis); `SHARD-WP-0005` **done** (9 tasks: CoreArchitectureBlueprint hardened against the 260615 review); `SHARD-WP-0006` **done** (5 tasks: round-2 hardening — overview reconciled, event-sourced coordination + append authority, adapter conformance, incremental correctness + I-2 verification) |
|
||||
|
||||
## In Scope (today)
|
||||
|
||||
@@ -32,11 +32,15 @@ Learnings update both SCOPE and INTENT where necessary.
|
||||
- Authorization model design (delegated authentication, core authorization).
|
||||
- Shard adapter contract and wiki page model (to be specified, then implemented).
|
||||
- Git-backed coordination journal for information spaces.
|
||||
- A **native, headless, API-first wiki-engine core** (small typed-extension core, as a
|
||||
canonical-mode shard backend) — design via SHARD-WP-0013; optimized for data-source
|
||||
integration and agent access.
|
||||
- State Hub workplan registration and consistency sync.
|
||||
|
||||
## Out Of Scope (today)
|
||||
|
||||
- A standalone wiki engine UI or rendering pipeline.
|
||||
- A wiki-engine **UI or rendering pipeline** (the engine is headless/API-first; presentation
|
||||
is a consumer concern). A bundled standalone UI is not provided.
|
||||
- Authentication, credential storage, or user directory implementation.
|
||||
- Hard-coded editorial, sync, or conflict-resolution policy.
|
||||
- Generic file mirroring independent of wiki-page semantics.
|
||||
|
||||
91
history/260615-core-architecture-blueprint-review-2.md
Normal file
@@ -0,0 +1,91 @@
|
||||
# Critical review (round 2) — CoreArchitectureBlueprint.md (hardened)
|
||||
|
||||
Date: 2026-06-15 · Reviewer: tegwick (with Claude) · Subject:
|
||||
`spec/CoreArchitectureBlueprint.md` after **SHARD-WP-0005** (commit f21b7b5) · Feeds:
|
||||
**SHARD-WP-0006**
|
||||
|
||||
A second hostile pass over the *hardened* blueprint. Round 1 found design bugs; this round
|
||||
finds (a) self-consistency regressions the surgical hardening introduced, and (b) deeper
|
||||
second-order gaps — some of which the hardening *sharpened*. Verdict: the architecture is now
|
||||
substantially sound; what remains in §B/§C are hard distributed-systems/operational questions,
|
||||
not design smells — except §A (a real regression) and three foundational gaps in §B.
|
||||
|
||||
---
|
||||
|
||||
## A. Self-consistency regressions introduced by surgical hardening
|
||||
|
||||
The 9 edits deepened §6–§9 but did not propagate to the **overview surfaces**, so the document
|
||||
now contradicts itself between its summary and its body (and readers trust the summary).
|
||||
|
||||
- **A-1 (real contradiction).** §4 still says "Addressing, **equivalence**, and transclusion
|
||||
key on identity" — the exact conflation T2 fixed in §7.2 (equivalence keys on *content
|
||||
fingerprint across distinct identities*). → **WP-0006 T1**
|
||||
- **A-2.** §4 "Projection — typed on two axes" and §4 "Provenance envelope … every artifact
|
||||
carries [full wrapper]" are stale vs T7's §8.4 (two-axis = extension point; trivial default)
|
||||
and §7.3 (layered effective-vs-own). → **T1**
|
||||
- **A-3.** §10 policy surface omits knobs the hardening added (freshness/staleness §8.8,
|
||||
squash-compaction §8.1, conflict-resolution preset §8.6, tenant-partition) — yet §11 defines
|
||||
`policy/` as "owns the §10 surface." The module contract points at a stale list. → **T1**
|
||||
- **A-4 (cosmetic).** §3 diagram + §11 header still say "L4 rebuildable cache" / "15 spectra,"
|
||||
advertising the pre-hardening model (§8.7 incremental-first; §6.5 orthogonal-core). → **T1**
|
||||
|
||||
Meta-point: surgical editing hardened the body but regressed whole-document coherence; v2
|
||||
needs an **overview-reconciliation pass**.
|
||||
|
||||
## B. Foundational gaps (serious; some sharpened by the hardening)
|
||||
|
||||
- **B-1 — The journal is now a concurrent-write DB, but it's single-writer Git.** §8.6's
|
||||
consistency model assumes "the journal is local Git, read-your-writes." L4 multi-tenant + the
|
||||
L6 Orchestrator API imply a server; HA/scale implies *multiple* instances. Concurrent commits
|
||||
of coordination-canonical state to one git journal = lock contention / merge races; Git is not
|
||||
a concurrent-write store. Either single-writer-per-space (an unstated HA ceiling) or a real
|
||||
concurrent coordination store with Git as an *export*. **T1 of WP-0005 worsened this** by
|
||||
loading more canonical state into the journal. The keystone unanswered question. → **T2**
|
||||
- **B-2 — Capability-as-data trusts self-reported profiles with no conformance check.** I-3 +
|
||||
§6.5's degradation contract assume the profile tells the truth. A buggy adapter (claims
|
||||
`merge=git/text`, corrupts; claims `notify`, never emits) silently poisons every degradation
|
||||
decision. No **adapter conformance suite** (declared profile == observed behavior) exists.
|
||||
Foundational for an architecture whose correctness rests on profile accuracy. → **T3**
|
||||
- **B-3 — "Coordination-canonical state in the journal" has no representation design.** T1
|
||||
relocated overlays/bindings/aliases/equivalence-sets/merges into "the journal" without saying
|
||||
*how* Git stores structured mutable state. "All equivalences touching X" over a git-of-files
|
||||
is O(scan) unless indexed — and an index is L4/derived. The new central concept is a black
|
||||
box; resolve *with* B-1. → **T2**
|
||||
- **B-4 — Incremental equivalence is under-specified/likely incorrect; I-2 only eventually
|
||||
true.** §8.7 re-verifies a changed page's *new* candidate set but not the pairs it *leaves*
|
||||
(a page exiting an LSH bucket can break an existing equivalence edge); the delta is not
|
||||
additive. Deeper: incremental maintenance drifts from `f(canonical)`, so I-2 holds only
|
||||
eventually, guaranteed solely by an expensive reconcile-against-rebuild. Needs a stated
|
||||
verification mechanism (background checker / digest-vs-sampled-rebuild). → **T4**
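Finding B-2 (declared profile versus observed behavior) can be made concrete with a toy conformance probe. The sketch below is illustrative only: `MemoryAdapter`, `LyingAdapter`, and `conformance_failures` are hypothetical, and a real suite would presumably need one probe per declared capability.

```python
class MemoryAdapter:
    """Toy adapter that honours its declared profile."""
    declared = {"read", "write"}

    def __init__(self) -> None:
        self._pages: dict[str, str] = {}

    def read(self, path: str) -> str:
        return self._pages[path]

    def write(self, path: str, text: str) -> None:
        self._pages[path] = text


class LyingAdapter(MemoryAdapter):
    """Declares `write` but silently drops every write."""

    def write(self, path: str, text: str) -> None:
        pass


def check_write_roundtrip(adapter) -> bool:
    """Observed behaviour for the `write` claim: a write must be readable back."""
    adapter.write("conformance/probe.md", "probe")
    try:
        return adapter.read("conformance/probe.md") == "probe"
    except KeyError:
        return False


def conformance_failures(adapter) -> list[str]:
    """Return declared capabilities whose probe does not pass."""
    probes = {"write": check_write_roundtrip}
    return [cap for cap in sorted(adapter.declared)
            if cap in probes and not probes[cap](adapter)]
```

An adapter whose failure list is non-empty would be rejected before its profile is allowed to drive any degradation decision.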
|
||||
|
||||
## C. Real but second-tier (track as open problems O-8…O-11)
|
||||
|
||||
- **C-1 — Mechanism-over-policy → operator burden; no preset bundles.** ~7 knob families with
|
||||
sub-modes and interactions; only authz (L0–L4) bundles into personas. Need named bundles
|
||||
("personal vault" / "team wiki" / "enterprise federation"). → **O-8 / T5**
|
||||
- **C-2 — Tenant partitioning (I-13) vs shard sharing + lazy projection.** A shard in two roots
|
||||
is cached twice → duplicate storage + double refresh on rate-limited backends. Shard
|
||||
exclusive-to-one-root or shareable? Unresolved. → **O-9 / T5**
|
||||
- **C-3 — Span-level authz + transclusion is an unmodeled leak path.** Authz is per
|
||||
page/shard/tenant; transclusion crosses shards at span granularity → a page can leak a span
|
||||
past its ACL (aggregation/inference). §7.3's ⊕ also stops being simple two-level inheritance
|
||||
across a transclusion boundary. → **O-10 / T5**
|
||||
- **C-4 — Union-under-unavailability undefined.** Freshness covers *stale*, nothing covers
|
||||
*down*. The dead-shard read path (partial union? error? last-known?) is unspecified though
|
||||
it's the commonest real failure. → **O-11 / T5**
|
||||
|
||||
## D. Recommended resolution (→ SHARD-WP-0006)
|
||||
|
||||
1. **§A reconciliation** (T1) — make the overview match the hardened body.
|
||||
2. **Journal & coordination-state model** (T2) — settle single-vs-multi-writer and separate the
|
||||
**coordination-state store** from the **content-history journal**. Likely resolution:
|
||||
**event-sourced coordination** — an append-only *decision log* is the coordination-canonical
|
||||
tier (git-addressable, I-6 preserved); the queryable current state (alias table, equivalence
|
||||
set) is a *derived fold* of the log (disposable). Append-logs tolerate concurrency far better
|
||||
than mutable-file Git; state a concurrency model. Resolves **B-1 + B-3** together.
|
||||
3. **Adapter conformance suite** (T3) — make a passing conformance run part of the contract
|
||||
(B-2): every adapter proves declared profile == observed behavior.
|
||||
4. **Incremental correctness + I-2 verification** (T4) — fix the leaving-bucket re-verification
|
||||
and propagation; add a background consistency-checker / derived-tier digest so I-2 is
|
||||
verifiable, not merely asserted (B-4).
|
||||
5. **Track §C** (T5) — O-8…O-11 with chosen direction + revisit trigger; close-out.
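The event-sourced resolution proposed in item 2 can be sketched as an append-only log whose queryable state is a fold over its events. This is a toy under stated assumptions: `ToyDecisionLog`, `Event`, and the event kinds are hypothetical and do not describe the repository's own log or its Git storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int
    kind: str   # "alias_set" or "alias_removed" (hypothetical event kinds)
    key: str
    value: str = ""


class ToyDecisionLog:
    """Append-only log; current state is never stored, only derived."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, kind: str, key: str, value: str = "") -> Event:
        # Events are only ever added; sequence numbers give a total order.
        event = Event(len(self._events) + 1, kind, key, value)
        self._events.append(event)
        return event

    def fold_aliases(self) -> dict[str, str]:
        """Rebuild the alias table by replaying every event in order."""
        table: dict[str, str] = {}
        for event in self._events:
            if event.kind == "alias_set":
                table[event.key] = event.value
            elif event.kind == "alias_removed":
                table.pop(event.key, None)
        return table
```

The alias table is disposable here: deleting it loses nothing, because replaying the log reproduces it. That is the property the review asks for when it separates the coordination-canonical log from its derived fold.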
|
||||
122
history/260615-core-architecture-blueprint-review.md
Normal file
@@ -0,0 +1,122 @@
|
||||
# Critical review — CoreArchitectureBlueprint.md
|
||||
|
||||
Date: 2026-06-15 · Reviewer: tegwick (with Claude) · Subject:
|
||||
`spec/CoreArchitectureBlueprint.md` @ commit **9b5b393** · Feeds: **SHARD-WP-0005**
|
||||
|
||||
A deliberately hostile review of the first whole-system architecture, to find where it
|
||||
**breaks (correctness)**, **fails to scale**, and **could be more elegant/efficient** before
|
||||
any implementation. Findings are prioritised; each is the input to a SHARD-WP-0005 task.
|
||||
|
||||
## Verdict in one line
|
||||
|
||||
The **layering and the dual narrow waist are sound and stay**. The **thesis is ~90% right**;
|
||||
the missing 10% (curatorial / coordination-canonical state) breaks its clean story. There are
|
||||
**two genuine bugs**, **two large unaddressed scaling risks**, and several **elegance/efficiency
|
||||
debts** — all fixable without touching INTENT.
|
||||
|
||||
---
|
||||
|
||||
## A. The framing crack (fix resolves three issues)
|
||||
|
||||
**A-1 — Two buckets hide a third.** The thesis "canonical at the edges, derived in the middle"
|
||||
omits **born-in-the-middle-but-canonical** state: overlays that are the local truth against a
|
||||
read-only shard (Flow C), manual **curator equivalence bindings**, alias tables, merge
|
||||
decisions. These encode human judgment or local-only content and **cannot be rebuilt** from
|
||||
shards+journal.
|
||||
|
||||
**Contradiction:** I-2 declares L4 rebuildable, yet §8.4 puts "alias table, curator binding"
|
||||
in L4. You cannot rebuild a curator's manual binding.
|
||||
|
||||
**Fix:** three states — **sharded-canonical**, **coordination-canonical** (journal: overlays,
|
||||
bindings, aliases, merges — durable, born in the middle), **derived-disposable** (union graph,
|
||||
indexes, projections). Re-frame §1 as **canonical (sharded + coordination) vs derived
|
||||
(disposable)**; `derived = f(canonical)` then becomes actually true. → **T1**
|
||||
|
||||
---
|
||||
|
||||
## B. Where it breaks (correctness)
|
||||
|
||||
**B-1 — Identity conflated with content-fingerprint (BUG).** §7.2 derives page identity from
|
||||
content fingerprint. That makes **editing a page change its identity**, breaking every
|
||||
reference. Fingerprints identify *versions/equivalence*, not *identity*. Page identity must be
|
||||
a **stable handle (uid)** surviving edits; fingerprints belong to the **equivalence** mechanism
|
||||
(§8.4). One word, two concepts, wrong implementation for the stable one. → **T2**
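The distinction in B-1 can be shown in a few lines. In this sketch (the names `Page` and `content_fingerprint` are hypothetical, and SHA-256 is just one possible fingerprint), the uid is assigned once and survives edits, while the fingerprint is recomputed from content.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


def content_fingerprint(text: str) -> str:
    """Content fingerprint: changes whenever the text changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Page:
    text: str
    # Stable handle: assigned once at creation, never derived from content.
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)

    @property
    def fingerprint(self) -> str:
        # Recomputed on access, so it always reflects the current text.
        return content_fingerprint(self.text)
```

Two pages with the same text share a fingerprint but not a uid, which is why equivalence keys on the former and references key on the latter.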
|
||||
|
||||
**B-2 — No concurrency/consistency model.** Concurrent overlays on one page, overlay applied
|
||||
after source drift, journal-commit vs shard-native-write ordering — all undefined. Conflict
|
||||
handling is deferred to "policy presets," but **conflict *detection + representation* is core
|
||||
mechanism**; only *resolution* is policy. The union's consistency guarantee is unstated
|
||||
(eventually-consistent? read-your-writes? causal-via-journal?). → **T3**
|
||||
|
||||
**B-3 — Persisted union cache + multi-tenant = leak surface.** §13 recommends a persisted L4
|
||||
cache; §9 protects content by *read-time* filtering on the provenance envelope. A persisted
|
||||
cross-tenant union cache guarded only by read-time filtering is an L4 attack surface. Tension
|
||||
between I-2 (persisted rebuildable cache), scale, and L5 isolation is unacknowledged. → **T8**
|
||||
|
||||
---
|
||||
|
||||
## C. Where it fails to scale
|
||||
|
||||
**C-1 — Equivalence detection is O(N²), no indexing/incremental story.** Fingerprint /
|
||||
span-set-overlap across all pages of all shards is combinatorial (10 shards × 100k pages = 10⁶
|
||||
pages, on the order of 10¹² pairwise comparisons). No blocking/LSH/indexing, no incremental maintenance. Biggest scaling
|
||||
hazard in the document. → **T4**
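The arithmetic behind C-1, and the effect of blocking, can be sketched as follows. The function names are hypothetical, and exact-fingerprint bucketing stands in for whatever blocking strategy is eventually chosen (an LSH scheme would bucket near-duplicates too).

```python
from collections import defaultdict
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered page pairs an all-vs-all comparison must examine."""
    return n * (n - 1) // 2


def candidate_pairs(fingerprints: dict[str, str]) -> set[tuple[str, str]]:
    """Blocking: only pages sharing a bucket key are ever compared."""
    buckets: dict[str, list[str]] = defaultdict(list)
    for page_id, fp in fingerprints.items():
        buckets[fp].append(page_id)
    pairs: set[tuple[str, str]] = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs
```

For 10 shards of 100k pages each, `pair_count(10 * 100_000)` is 499,999,500,000 unordered pairs, the same order of magnitude as the figure in the finding. With blocking, the work is bounded by the bucket sizes instead of the total page count.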
|
||||
|
||||
**C-2 — "Rebuildable cache" collides with the operational-envelope axis.** A byte-exact
|
||||
rebuild requires reading *every page of every shard*, including rate-limited/paginated
|
||||
external APIs (Notion) and irreducibly-live sources — hours-to-days. I-2 contradicts axis-10.
|
||||
**Incremental, change-driven maintenance must be primary** (notify→delta), rebuild a rare
|
||||
fallback. Cache invalidation — the actual hard problem — is named once and never designed. →
|
||||
**T4, T5**
|
||||
|
||||
**C-3 — Unbounded history at open L0 = DoS/perf.** "Every write a commit" + "open for all" ⇒
|
||||
the git journal grows without bound under bots/vandalism and git degrades on huge histories.
|
||||
"History is the floor" has an unacknowledged cost: packing, compaction, per-shard offload. →
|
||||
**T8**
|
||||
|
||||
---
|
||||
|
||||
## D. Elegance / efficiency debts
|
||||
|
||||
**D-1 — The 15 spectra assert a clean degradation function never demonstrated.** Either most
|
||||
axes are irrelevant to most ops (then the 15-D profile is ceremony), or behavior depends on
|
||||
several axes *jointly* (then "no per-backend code" becomes a sprawling axis-interaction matrix
|
||||
— the flat-checklist problem in higher dimensions). And the axes **aren't orthogonal**
|
||||
(git-native history ⟺ git-IS-store ⟺ git/text merge; encrypted opacity ⟹ query/translation
|
||||
collapse). Model a **smaller orthogonal core** + **derived/implied** positions, and state the
|
||||
**axis-interaction subset** the degradation logic truly uses. → **T6**
|
||||
|
||||
**D-2 — Provenance envelope isn't inherited; it'll dwarf the content.** Per-span envelopes at
|
||||
block granularity = 10k near-identical envelopes for a 10k-block graph. The doc already
|
||||
invented the right pattern for Trilium ("effective-vs-own with per-attribute provenance") and
|
||||
failed to apply it to its own envelope. Make provenance **layered (page envelope + span
|
||||
deltas)**. → **T7**
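The layered provenance suggested in D-2 amounts to storing one envelope per page and only the differences per span. A minimal sketch, assuming provenance can be represented as flat dictionaries (the field names and `effective_provenance` are hypothetical):

```python
def effective_provenance(page_envelope: dict, span_delta: dict) -> dict:
    """Span provenance = page envelope overridden by the span's own delta."""
    return {**page_envelope, **span_delta}


# One envelope for the whole page.
page_envelope = {"shard": "notes-git", "cached": False, "has_overlay": False}

# Only spans that differ from the page carry a delta.
span_deltas = {
    "block-17": {"shard": "vault", "cached": True},  # e.g. a transcluded span
}


def span_provenance(span_id: str) -> dict:
    """Resolve the effective provenance of one span on demand."""
    return effective_provenance(page_envelope, span_deltas.get(span_id, {}))
```

A 10k-block page whose blocks all come from the same shard then stores one envelope and no deltas, rather than 10k near-identical envelopes.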
|
||||
|
||||
**D-3 — Projection machinery over-fit to the exotic tail.** Two-axis model + three facets +
|
||||
view registry exist mostly for UC-83/84 (2 of 84 UCs); the 95% case (markdown in git) pays the
|
||||
weight. Make the **common case trivial** (default = plain lazy replication) and
|
||||
derivation/liveness an **extension point**, not a taxonomy every projection instantiates. →
|
||||
**T7**
|
||||
|
||||
**D-4 — Cross-cutting rails are the highest-coupling components, presented as clean.**
|
||||
`provenance/` and capability types are imported by every layer (god-modules); an envelope
|
||||
change ripples everywhere. And **policy has no module** (§10 enumerates it; §11 omits it)
|
||||
despite being consulted by L3/L4/L5. Give policy a home; pin the rails behind stable narrow
|
||||
interfaces. → **T7**
|
||||
|
||||
---
|
||||
|
||||
## E. What explicitly stays
|
||||
|
||||
- The 6-layer model + the dual narrow waist (adapter contract / page model).
|
||||
- Capability-as-data (I-3), union-without-erasure (I-4), overlay-before-mutation (I-5),
|
||||
Git-addressable coordination (I-6), mechanism-over-policy (I-7), graceful degradation (I-8).
|
||||
- The federation-model taxonomy and the auth ladder (ArchitectureBlueprint.md).
|
||||
|
||||
## F. Disposition
|
||||
|
||||
Some findings are **solvable now** (A-1, B-1, D-2, D-3, D-4, C-3); some are **partially open**
|
||||
and should be tracked honestly rather than pretend-solved (B-2 consistency model: pick a
|
||||
guarantee; C-1 equivalence-at-scale: pick a blocking strategy; D-1 axis interactions: enumerate
|
||||
the real subset). SHARD-WP-0005 closes the solvable ones and records the open ones in a new
|
||||
"Known scaling risks & open problems" section of the blueprint. → **T9**
|
||||
45
history/260615-reuse-surface-contributions.md
Normal file
@@ -0,0 +1,45 @@
# reuse-surface contributions — shard-wiki (SHARD-WP-0013 T3)

Date: 2026-06-15 · From: shard-wiki (whynot) · To: reuse-surface (helix_forge) · Tracked list
of (A) capabilities shard-wiki registered, (B) **gaps** shard-wiki proposes the reuse surface
add, and (C) capabilities shard-wiki will **consume** rather than rebuild. Communicated to the
reuse-surface agent via state-hub `send_message`.

## A. Registered (T1 + T3) — 8 entries

`capability.wiki.{shard-orchestration, adapter-contract, page-model, coordination-journal,
overlay, federation-models, engine-typed-extensions, derived-views}` — committed in
reuse-surface, `validate` ok (20 entries total).

## B. Proposed gaps (cross-cutting; not shard-wiki-internal) → reuse-surface should own/define

- **G1 — `capability.platform.typed-extension-framework`** *(suggested)*
  A reusable pattern: a **small core + a stringent typed-extension framework** where extensions
  declare typed contracts, compose, and are **activated per context**. shard-wiki's wiki engine
  (`capability.wiki.engine-typed-extensions`) is one instance, but the *pattern* is
  cross-domain (any HelixForge capability platform). Evidence: shard-wiki UseCaseCatalog
  "Capability structure" layer (core + 10 typed extensions + conflict-mediation map).
  Suggested owner: helix_forge / reuse-surface. Relation: would `generalize`
  `capability.wiki.engine-typed-extensions`.

- **G2 — `capability.content.translation-fidelity`** *(suggested)*
  Lossless/lossy **content translation with an explicit fidelity report** (what round-trips
  cleanly vs degrades, non-mappable elements preserved as provenance). Reusable well beyond
  wiki (any format-bridging consumer). Evidence: shard-wiki TSD §A.6, UC-42/UC-59.

## C. Consumptions (reuse, do not rebuild)

- **`capability.feature-control.evaluate`** (helix_forge/feature-control) → shard-wiki's
  **per-shard extension/feature activation** (the "activate only what you need" mechanism).
  Already recorded as a relation on `capability.wiki.engine-typed-extensions`.
- **`capability.authorization.policy-evaluate`** (flex-auth) → shard-wiki's **X-AUTHZ** policy
  decisions. shard-wiki owns the authz *model* (authz-in-core) but can reuse this evaluation
  engine rather than building one.
- **`capability.statehub.{progress-log, workstream-coordinate}`** → already in use for
  coordination across this work.

## Status

Gaps G1/G2 are **suggestions** to the reuse-surface owner (not unilateral registrations, since
they are cross-cutting, not shard-wiki-internal). Consumptions are recorded for the engine
architecture (T5) so it reuses rather than reinvents.
@@ -1,8 +1,19 @@
|
||||
# history/
|
||||
|
||||
Archived material that is no longer needed for daily work but should be kept.
|
||||
Archived material and the project's **meta-history**: finished/canceled workplans kept for
|
||||
the record, plus durable **reviews, critical assessments, and decision records** — the
|
||||
reasoning behind the specs, captured at a point in time.
|
||||
|
||||
Use a `yymmdd-` prefix when archiving files or directories. Content here is
|
||||
**out of scope** for regular tasks — consult only for research or diagnostics.
|
||||
Use a `yymmdd-` prefix. Archived material is **out of scope** for regular tasks (consult only
|
||||
for research or diagnostics); assessment/review records are point-in-time and may seed active
|
||||
workplans, but are not edited after the fact — supersede with a new dated record and link back.
|
||||
|
||||
Finished or canceled workplans from `workplans/` are archived here.
|
||||
Distinct from the **coordination journal** (a runtime Git-backed record of *content* change
|
||||
flows inside an information space, an INTENT domain concept); `history/` is the *project's own*
|
||||
design evolution.
|
||||
|
||||
| Date | Record | Subject |
|------|--------|---------|
| 2026-06-15 | `260615-core-architecture-blueprint-review.md` | Critical review of `spec/CoreArchitectureBlueprint.md` (commit 9b5b393); inputs to `SHARD-WP-0005` |
| 2026-06-15 | `260615-core-architecture-blueprint-review-2.md` | Round-2 review of the hardened blueprint (post-`SHARD-WP-0005`, f21b7b5); inputs to `SHARD-WP-0006` |
| 2026-06-15 | `260615-reuse-surface-contributions.md` | shard-wiki's reuse-surface registrations, proposed gaps (G1/G2), and consumptions (`SHARD-WP-0013` T3) |
@@ -36,6 +36,11 @@ pythonpath = ["src"]
|
||||
branch = true
|
||||
source = ["shard_wiki"]
|
||||
|
||||
[tool.coverage.report]
|
||||
show_missing = true
|
||||
# Quality floor for `pytest --cov` / `coverage report` (not forced on a bare `pytest` run).
|
||||
fail_under = 90
|
||||
|
||||
[tool.ruff]
|
||||
src = ["src", "tests"]
|
||||
target-version = "py311"
|
||||
|
||||
12
registry/README.md
Normal file
@@ -0,0 +1,12 @@
# Capability Registry

Markdown-first capability index for federation and reuse planning.

## Authoring

1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
2. Add the row to `indexes/capabilities.yaml`.
3. Run `reuse-surface validate` from a checkout with the CLI installed.
4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.

Federation contract: reuse-surface `docs/RegistryFederation.md`.
103
registry/capabilities/capability.wiki.adapter-contract.md
Normal file
@@ -0,0 +1,103 @@
|
||||
---
|
||||
id: capability.wiki.adapter-contract
|
||||
name: Capability-Aware Shard Adapter Contract
|
||||
summary: A versioned backend interface where each binding declares a verified capability profile (positions on capability spectra), so federation ops degrade by capability.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, adapter, capability, contract, conformance, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D5
|
||||
target: D6
|
||||
confidence: high
|
||||
rationale: >
|
||||
Fifteen capability spectra with an orthogonal core + implication rules, plus
|
||||
a normative contract spec (TSD Section A); derived from a ~23-system synthesis.
|
||||
availability:
|
||||
current: A2
|
||||
target: A5
|
||||
confidence: medium
|
||||
rationale: >
|
||||
AdapterContract + a read/write FolderAdapter + a conformance suite that
|
||||
verifies declared profile == observed behaviour exist as a source module.
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C2
|
||||
name: Partial
|
||||
confidence: medium
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations:
|
||||
- versioned interface with declared, conformance-verified capability profiles
|
||||
- one concrete adapter (file-store) passes the conformance suite
|
||||
broken_expectations:
|
||||
- only one substrate implemented (git-IS-store, REST, CRDT adapters planned)
|
||||
out_of_scope_expectations:
|
||||
- hosting backends
|
||||
reliability:
|
||||
level: R1
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- single adapter implemented so far
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
Mediate heterogeneity at one narrow waist: a backend participates by implementing a
|
||||
versioned interface and declaring a verified position on each capability spectrum.
|
||||
includes:
|
||||
- capability profile as data (orthogonal-core spectra + implied positions)
|
||||
- operation verbs (read/write/diff/merge/notify/.../derive-projection/execute)
|
||||
- a conformance suite (profiles verified, not self-asserted)
|
||||
excludes:
|
||||
- assuming uniform backend capabilities
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-34..UC-43, UC-50, UC-57, UC-60..UC-69 (shard attachment & adapter binding)"
|
||||
|
||||
availability:
|
||||
current_level: A2
|
||||
target_level: A5
|
||||
current_artifacts:
|
||||
- "shard-wiki/src/shard_wiki/adapters/"
|
||||
consumption_modes:
|
||||
- source module
|
||||
|
||||
relations:
|
||||
depends_on:
|
||||
- capability.wiki.page-model
|
||||
supports:
|
||||
- capability.wiki.shard-orchestration
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/spec/TechnicalSpecificationDocument.md (Section A)"
|
||||
- "shard-wiki/spec/CoreArchitectureBlueprint.md (Section 6)"
|
||||
tests:
|
||||
- "shard-wiki/tests/test_folder_adapter.py"
|
||||
- "shard-wiki/tests/test_conformance.py"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- exposing any page store as a capability-described, conformance-checked shard
|
||||
not_recommended_for:
|
||||
- backends that cannot honestly describe their capabilities
|
||||
known_limitations:
|
||||
- reference implementation covers the file-store substrate only so far
|
||||
---
|
||||
|
||||
# Capability-Aware Shard Adapter Contract

The bottom narrow waist of shard-wiki: a versioned interface plus a **verified** capability
profile per binding. Core logic is written once against capabilities (not per-backend), and
the conformance suite rejects profiles whose declared abilities don't match observed behaviour.

## Assessment notes

### Discovery
Fifteen spectra reduced to an orthogonal core with implication rules (CoreArchitectureBlueprint
Section 6.5); normative in TSD Section A.

### Availability
`adapters/` ships the contract, a folder adapter, and `assert_conformant`.
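The "declared profile == observed behaviour" idea can be illustrated with a toy sketch. Everything below is hypothetical and is not the `adapters/` module's API: `Profile`, `DictAdapter`, `observed_profile`, and `check_conformance` are invented names, and a single `writable` flag stands in for the fifteen capability spectra.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Capability profile as data; one boolean stands in for a full spectrum."""
    writable: bool


class DictAdapter:
    """Toy in-memory backend (hypothetical) that carries a declared profile."""

    def __init__(self, pages: dict[str, str], writable: bool, declared: Profile):
        self._pages = dict(pages)
        self._writable = writable
        self.declared = declared

    def read(self, name: str) -> str:
        return self._pages[name]

    def write(self, name: str, text: str) -> None:
        if not self._writable:
            raise PermissionError("backend is read-only")
        self._pages[name] = text

    def delete(self, name: str) -> None:
        if not self._writable:
            raise PermissionError("backend is read-only")
        self._pages.pop(name, None)


def observed_profile(adapter: DictAdapter, probe: str = "__probe__") -> Profile:
    """Probe the backend instead of trusting its declaration."""
    try:
        adapter.write(probe, "probe")
    except PermissionError:
        return Profile(writable=False)
    adapter.delete(probe)  # leave no trace of the probe page
    return Profile(writable=True)


def check_conformance(adapter: DictAdapter) -> None:
    """Raise AssertionError when declared and observed profiles differ."""
    observed = observed_profile(adapter)
    if observed != adapter.declared:
        raise AssertionError(f"declared {adapter.declared} but observed {observed}")
```

An adapter that declares `writable=True` over a read-only store fails the check, which is the sense in which profiles are verified rather than self-asserted.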
103
registry/capabilities/capability.wiki.coordination-journal.md
Normal file
@@ -0,0 +1,103 @@
|
||||
---
|
||||
id: capability.wiki.coordination-journal
|
||||
name: Event-Sourced Coordination Journal
|
||||
summary: An append-only, totally-ordered-per-space decision log (overlays, bindings, aliases, merges, forks) whose current state is a derived fold; git-addressable history.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, event-sourcing, coordination, git, journal, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D5
|
||||
target: D6
|
||||
confidence: high
|
||||
rationale: >
|
||||
Keystone resolved across two architecture reviews: coordination-canonical state
|
||||
as an append-only decision log with a per-space append authority; current state
|
||||
is a derived fold (derived = f(log)).
|
||||
availability:
|
||||
current: A2
|
||||
target: A4
|
||||
confidence: medium
|
||||
rationale: >
|
||||
In-memory DecisionLog + fold work as a source module; the git-backed store with a
|
||||
per-space lease (the production backing) is planned.
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C2
|
||||
name: Partial
|
||||
confidence: medium
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations:
|
||||
- append-only, totally-ordered-per-space log with read-your-writes
|
||||
- derived fold to aliases + transitively-merged equivalence groups
|
||||
broken_expectations:
|
||||
- git-backed storage and per-space lease/append-authority not yet implemented
|
||||
out_of_scope_expectations:
|
||||
- general-purpose event bus
|
||||
reliability:
|
||||
level: R1
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- in-memory backing only; cross-process durability pending
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
Make coordination-canonical decisions durable and git-addressable as events, with the
|
||||
queryable current state always recomputable by replay.
|
||||
includes:
|
||||
- append-only decision log, totally ordered per information space
|
||||
- derived fold to current coordination state (aliases, equivalence groups, overlays)
|
||||
- per-space append authority (concurrency model)
|
||||
excludes:
|
||||
- storing derived/disposable union state
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-29, UC-33 (history, attribution, coordination journal)"
|
||||
|
||||
availability:
|
||||
current_level: A2
|
||||
target_level: A4
|
||||
current_artifacts:
|
||||
- "shard-wiki/src/shard_wiki/coordination/decision_log.py"
|
||||
target_artifacts:
|
||||
- git-backed log store with per-space lease
|
||||
consumption_modes:
|
||||
- source module
|
||||
|
||||
relations:
|
||||
supports:
|
||||
- capability.wiki.shard-orchestration
|
||||
- capability.wiki.overlay
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/spec/CoreArchitectureBlueprint.md (Section 8.1)"
|
||||
tests:
|
||||
- "shard-wiki/tests/test_decision_log.py"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- durable, replayable, git-addressable coordination state for a federated space
|
||||
not_recommended_for:
|
||||
- high-frequency general event streaming
|
||||
known_limitations:
|
||||
- production git backing + lease are still on the roadmap (SHARD-WP-0009)
|
||||
---
|
||||
|
||||
# Event-Sourced Coordination Journal

The keystone: coordination-canonical state (overlays, equivalence bindings, aliases, merges,
forks) is an append-only **decision log**, totally ordered per information space; the queryable
current state is a derived **fold** of the log (`derived = f(log)`). The log is git-addressable,
giving history/patch/review/backup for coordination decisions for free.

## Assessment notes

### Discovery
Resolved across the round-1/round-2 architecture reviews (CoreArchitectureBlueprint Section 8.1).

### Availability
`decision_log.py` ships an in-memory, totally-ordered log + fold; git+lease backing is planned.
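A minimal sketch of `derived = f(log)`, under stated assumptions: `ToyDecisionLog`, `Decision`, and `fold` are hypothetical names and do not reflect the actual contents of `decision_log.py`. Only two event kinds are modelled (`alias` and `merge`), and the fold produces aliases plus transitively merged equivalence groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    seq: int   # position in the per-space total order
    kind: str  # "alias" or "merge" in this sketch
    a: str
    b: str


class ToyDecisionLog:
    """Append-only and in-memory; the sequence number gives the total order."""

    def __init__(self) -> None:
        self._events: list[Decision] = []

    def append(self, kind: str, a: str, b: str) -> Decision:
        decision = Decision(len(self._events), kind, a, b)
        self._events.append(decision)
        return decision

    def events(self) -> tuple[Decision, ...]:
        return tuple(self._events)


def fold(events):
    """Recompute current state purely from the log (no stored derived state)."""
    aliases: dict[str, str] = {}
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for event in events:
        if event.kind == "alias":
            aliases[event.a] = event.b
        elif event.kind == "merge":
            root_a, root_b = find(event.a), find(event.b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups: dict[str, set[str]] = {}
    for page in list(parent):
        groups.setdefault(find(page), set()).add(page)
    return aliases, {frozenset(members) for members in groups.values()}
```

Because `fold` reads nothing but the events, replaying the same log always yields the same state, which is what makes the derived state disposable.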
87
registry/capabilities/capability.wiki.derived-views.md
Normal file
@@ -0,0 +1,87 @@
|
||||
---
|
||||
id: capability.wiki.derived-views
|
||||
name: Wiki Derived Views
|
||||
summary: Recomputable views over a wiki union — BackLinks, RecentChanges, AllPages, SiteMap, and (delegate-or-derive) Search — carrying provenance.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, derived-views, backlinks, recentchanges, search, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D3
|
||||
target: D5
|
||||
confidence: medium
|
||||
rationale: >
|
||||
Core-vs-adapter classification and behaviours are decided (FederationRequirements ADR-03);
|
||||
implementation is planned (SHARD-WP-0010), not built.
|
||||
availability:
|
||||
current: A0
|
||||
target: A4
|
||||
confidence: low
|
||||
rationale: >
|
||||
Designed; no implementation yet. Informational/planning reuse only today.
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C0
|
||||
name: Absent
|
||||
confidence: low
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations: []
|
||||
broken_expectations:
|
||||
- no derived view is implemented yet
|
||||
out_of_scope_expectations:
|
||||
- presentation / rendering of views
|
||||
reliability:
|
||||
level: R0
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- planning-stage
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
Provide recomputable, provenance-carrying views over the union (link graph, change feed,
|
||||
enumeration, search) without introducing canonical state.
|
||||
includes:
|
||||
- BackLinks (link graph), RecentChanges (journal + shard signals), AllPages, SiteMap
|
||||
- Search as delegate-to-native-or-derive-index
|
||||
excludes:
|
||||
- view presentation / UI
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-17..UC-21, UC-63"
|
||||
|
||||
availability:
|
||||
current_level: A0
|
||||
target_level: A4
|
||||
current_artifacts:
|
||||
- "shard-wiki/workplans/SHARD-WP-0010-derived-views.md"
|
||||
consumption_modes:
|
||||
- informational
|
||||
|
||||
relations:
|
||||
depends_on:
|
||||
- capability.wiki.shard-orchestration
|
||||
- capability.wiki.page-model
|
||||
related_to:
|
||||
- capability.wiki.engine-typed-extensions
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/spec/FederationRequirements.md (ADR-03)"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- planning derived navigation/discovery over a federated wiki union
|
||||
not_recommended_for:
|
||||
- implementation reuse today (planning-stage)
|
||||
known_limitations:
|
||||
- not implemented; Search ranking policy undecided
|
||||
---
|
||||
|
||||
# Wiki Derived Views

Recomputable views over the union (BackLinks, RecentChanges, AllPages, SiteMap, Search). All
are derived/disposable (no canonical state) and carry provenance; Search is delegate-to-native
where a shard's query capability allows, else a derived index. Planned in SHARD-WP-0010.
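Since nothing is implemented yet, the following is only an illustrative sketch of the delegate-or-derive rule for Search. `Shard` and `search` are hypothetical names, and a linear substring scan stands in for a real derived index.

```python
from typing import Callable, Optional


class Shard:
    """Hypothetical shard: pages plus an optional native search capability."""

    def __init__(
        self,
        name: str,
        pages: dict[str, str],
        native_search: Optional[Callable[[str], list[str]]] = None,
    ) -> None:
        self.name = name
        self.pages = pages
        self.native_search = native_search


def search(shards: list[Shard], term: str) -> list[tuple[str, str, str]]:
    """Return (shard, page, how) hits; `how` records the hit's provenance."""
    hits: list[tuple[str, str, str]] = []
    for shard in shards:
        if shard.native_search is not None:
            # Delegate when the shard declares a query capability.
            names = shard.native_search(term)
            how = "delegated"
        else:
            # Otherwise derive: a scan here, standing in for a derived index.
            names = [n for n, text in shard.pages.items() if term.lower() in text.lower()]
            how = "derived"
        hits.extend((shard.name, n, how) for n in sorted(names))
    return hits
```

Each hit records which shard produced it and whether it was delegated or derived, so the view carries provenance without introducing canonical state.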
115
registry/capabilities/capability.wiki.engine-typed-extensions.md
Normal file
@@ -0,0 +1,115 @@
|
||||
---
|
||||
id: capability.wiki.engine-typed-extensions
|
||||
name: Wiki Engine with Typed Extensions
|
||||
summary: A small-core wiki engine realizing a stringent typed-extension framework that addresses all wiki use cases and lets each shard activate only the features it needs.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, engine, typed-extensions, feature-activation, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D3
|
||||
target: D5
|
||||
confidence: medium
|
||||
rationale: >
|
||||
Architecture authored (shard-wiki/spec/WikiEngineCoreArchitecture.md): small page-store
|
||||
kernel + typed-extension framework, per-shard activation, engine-as-canonical-mode-shard,
|
||||
and a conflict-mediation realization are explored. Detailed extension SDK/ABI and the API
|
||||
protocol remain (so D3 Explored, not yet D4/D5).
|
||||
availability:
|
||||
current: A0
|
||||
target: A4
|
||||
confidence: low
|
||||
rationale: >
|
||||
Planned. No engine kernel or extensions exist yet; informational/planning reuse only.
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C0
|
||||
name: Absent
|
||||
confidence: low
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations: []
|
||||
broken_expectations:
|
||||
- engine core and typed-extension mechanism not yet designed in detail
|
||||
out_of_scope_expectations:
|
||||
- replacing other wiki engines or mandating one implementation
|
||||
reliability:
|
||||
level: R0
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- planning-stage capability
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
Provide shard-wiki's reference first-party shard backend: a small core + a stringent
|
||||
typed-extension framework covering all collected use cases, mediating conflicting
|
||||
requirements into an integrated whole, with per-shard activation (only what you need).
|
||||
includes:
|
||||
- a minimal engine kernel (page lifecycle, storage via the adapter contract, the typing mechanism)
|
||||
- typed extensions that declare contracts and compose
|
||||
- per-shard feature activation
|
||||
excludes:
|
||||
- replacing or mandating other wiki engines (it is one shard type among many)
|
||||
- a single canonical implementation for all wikis
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-08..UC-25 and the full catalog (the engine must cover all)"
|
||||
|
||||
availability:
|
||||
current_level: A0
|
||||
target_level: A4
|
||||
current_artifacts:
|
||||
- "shard-wiki/workplans/SHARD-WP-0013-wiki-engine-prep.md"
|
||||
- "shard-wiki/spec/WikiEngineCoreArchitecture.md"
|
||||
consumption_modes:
|
||||
- informational
|
||||
|
||||
relations:
|
||||
depends_on:
|
||||
- capability.wiki.adapter-contract
|
||||
- capability.wiki.page-model
|
||||
related_to:
|
||||
- capability.feature-control.evaluate
|
||||
- capability.authorization.policy-evaluate
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/workplans/SHARD-WP-0013-wiki-engine-prep.md"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- planning a composable, feature-activatable native wiki engine
|
||||
not_recommended_for:
|
||||
- implementation reuse today (planning-stage)
|
||||
known_limitations:
|
||||
- architecture authored; extension SDK/ABI + API protocol still to design; not yet built
|
||||
|
||||
promotion_history:
|
||||
- date: "2026-06-15"
|
||||
dimension: discovery
|
||||
from: D2
|
||||
to: D3
|
||||
rationale: WikiEngineCoreArchitecture.md authored (kernel + typed-extension framework explored); INTENT amendment ratified.
|
||||
author: shard-wiki
|
||||
---
|
||||
|
||||
# Wiki Engine with Typed Extensions

shard-wiki's planned reference first-party shard backend — a *canonical-mode shard* it
implements natively: a small core plus a stringent typed-extension framework addressing all
collected use cases, mediating conflicting requirements into a consistent whole, with per-shard
activation (activate only what you need). It is one shard type among many — not a replacement
for other engines. Per-shard activation is a candidate consumer of
`capability.feature-control.evaluate`.

## Assessment notes

### Discovery
Architecture authored: `shard-wiki/spec/WikiEngineCoreArchitecture.md` (small kernel +
typed-extension framework; engine = canonical-mode shard). INTENT amendment ratified
(2026-06-15, decision 84ffdb48). Extension SDK/ABI + API protocol are the next deliverables.

### Availability
Planning-stage; informational reuse only.
97
registry/capabilities/capability.wiki.federation-models.md
Normal file
@@ -0,0 +1,97 @@
|
||||
---
|
||||
id: capability.wiki.federation-models
|
||||
name: Selectable Federation-Model Taxonomy
|
||||
summary: Federation as a plural, composable coordination axis (fork+journal, VCS-replication+ping, query-time graph-join, feed, activity-streams, engine-mirror) selected per space.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, federation, taxonomy, composable, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D4
|
||||
target: D6
|
||||
confidence: high
|
||||
rationale: >
|
||||
A six-model taxonomy distilled from a ~23-system synthesis, each model anchored in a
|
||||
real system, with capability prerequisites and per-space/per-shard composition rules.
|
||||
availability:
|
||||
current: A0
|
||||
target: A4
|
||||
confidence: low
|
||||
rationale: >
|
||||
Designed and specified (FederationArchitecture T17) but not implemented; informational
|
||||
reuse only today.
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C1
|
||||
name: Sparse
|
||||
confidence: low
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations:
|
||||
- the model taxonomy and selection/composition rules are documented
|
||||
broken_expectations:
|
||||
- no federation transport is implemented yet
|
||||
out_of_scope_expectations:
|
||||
- mandating a single federation mechanism
|
||||
reliability:
|
||||
level: R0
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- design-stage; no runtime evidence
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
Treat federation as selectable and composable rather than one mechanism, so each space
|
||||
picks fork+journal, VCS-replication, query-join, feed, activity-streams, or engine-mirror.
|
||||
includes:
|
||||
- the six federation models + their capability floors
|
||||
- per-space selection and per-shard composition
|
||||
excludes:
|
||||
- imposing one homogeneous federation network
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-26, UC-31, UC-33, UC-71, UC-72, UC-74, UC-79"
|
||||
|
||||
availability:
|
||||
current_level: A0
|
||||
target_level: A4
|
||||
current_artifacts:
|
||||
- "shard-wiki/spec/FederationArchitecture.md (T17)"
|
||||
consumption_modes:
|
||||
- informational
|
||||
|
||||
relations:
|
||||
depends_on:
|
||||
- capability.wiki.shard-orchestration
|
||||
- capability.wiki.coordination-journal
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/spec/FederationArchitecture.md"
|
||||
- "shard-wiki/research/260614-shard-spectrum-synthesis/findings.md"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- planning a federation strategy that mixes models per source
|
||||
not_recommended_for:
|
||||
- implementation reuse today (design-stage)
|
||||
known_limitations:
|
||||
- no transport implemented; informational planning reuse only
|
||||
---
|
||||
|
||||
# Selectable Federation-Model Taxonomy

Federation is plural and composable: fork+journal (Federated Wiki), VCS-replication+ping
(ikiwiki), query-time graph-join (Wikibase SERVICE), feed aggregation, activity streams
(ActivityPub), and engine-mirror (Wiki.js). A space selects a model and composes per shard;
the default is fork+journal over git. Design-stage capability — strong for planning reuse.

## Assessment notes

### Discovery
FederationArchitecture T17, distilled from the shard-spectrum synthesis (v3).

### Availability
Specified, not implemented — informational reuse only.
102
registry/capabilities/capability.wiki.overlay.md
Normal file
@@ -0,0 +1,102 @@
|
||||
---
|
||||
id: capability.wiki.overlay
|
||||
name: Overlay-Before-Mutation Write Path
|
||||
summary: Non-destructive edits (draft -> patch -> apply-under-drift) that let read-only, rate-limited, or lossy backends be edited safely without silent remote mutation.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, overlay, patch, write-path, conflict, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D5
|
||||
target: D6
|
||||
confidence: high
|
||||
rationale: >
|
||||
Overlay lifecycle and apply-under-drift semantics are specified (ADR-05, blueprint
|
||||
Section 8.6) and implemented as a single principled write path.
|
||||
availability:
|
||||
current: A2
|
||||
target: A4
|
||||
confidence: medium
|
||||
rationale: >
|
||||
OverlayEngine (draft/patch/apply), writable adapter, and InformationSpace.edit
|
||||
exist as a source module; three-way merge is not (refuse-on-drift only).
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C2
|
||||
name: Partial
|
||||
confidence: medium
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations:
|
||||
- draft -> patch -> apply with fast-forward / refuse-on-drift / keep-draft outcomes
|
||||
- no silent remote mutation; overlay_state surfaced in provenance
|
||||
broken_expectations:
|
||||
- three-way / auto merge not implemented (refuse-on-conflict only)
|
||||
out_of_scope_expectations:
|
||||
- federation propagation of applied overlays
|
||||
reliability:
|
||||
level: R1
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- early implementation; conflict handling is detect-and-refuse only
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
Make any sub-write-through backend editable safely: an edit is an overlay first, applied
|
||||
only on explicit intent and only when the source has not drifted.
|
||||
includes:
|
||||
- overlay drafts recorded as coordination-canonical events
|
||||
- patch rendering (unified diff)
|
||||
- apply-under-drift (fast-forward / refuse / keep-draft)
|
||||
excludes:
|
||||
- destructive write without drift check
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-04, UC-26, UC-29 (remix primitives, overlay)"
|
||||
|
||||
availability:
|
||||
current_level: A2
|
||||
target_level: A4
|
||||
current_artifacts:
|
||||
- "shard-wiki/src/shard_wiki/coordination/overlay.py"
|
||||
- "shard-wiki/src/shard_wiki/coordination/patch.py"
|
||||
consumption_modes:
|
||||
- source module
|
||||
|
||||
relations:
|
||||
depends_on:
|
||||
- capability.wiki.coordination-journal
|
||||
- capability.wiki.adapter-contract
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/spec/FederationRequirements.md (ADR-05)"
|
||||
- "shard-wiki/spec/CoreArchitectureBlueprint.md (Section 8.2, 8.6)"
|
||||
tests:
|
||||
- "shard-wiki/tests/test_apply.py"
|
||||
- "shard-wiki/tests/test_write_path_integration.py"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- safe editing over read-only / rate-limited / lossy backends
|
||||
not_recommended_for:
|
||||
- workflows needing automatic conflict resolution today
|
||||
known_limitations:
|
||||
- merge is detect-and-refuse; three-way merge is future work
|
||||
---
|
||||
|
||||
# Overlay-Before-Mutation Write Path

One principled write path: every edit drafts an overlay (a coordination-canonical event),
renders as a patch, and applies under drift checks — fast-forwarding a writable target,
keeping a local draft on a read-only target, and refusing (never clobbering) on external drift.

## Assessment notes

### Discovery
Specified in FederationRequirements ADR-05 and CoreArchitectureBlueprint Section 8.2/8.6.

### Availability
`overlay.py` + `patch.py` + `InformationSpace.edit` ship the path; built in SHARD-WP-0008.
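The three outcomes (fast-forward, keep-draft, refuse) can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `overlay.py` or `patch.py`: `Overlay`, `draft`, `render_patch`, and `apply_overlay` are hypothetical names, a plain `dict` plays the backend, and drift is detected by comparing SHA-256 fingerprints.

```python
import difflib
import hashlib
from dataclasses import dataclass


def fingerprint(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Overlay:
    page: str
    base_fingerprint: str  # what the source looked like when the draft was made
    base_text: str
    new_text: str


def draft(page: str, current_text: str, new_text: str) -> Overlay:
    """Record an edit as an overlay; nothing is written to the backend."""
    return Overlay(page, fingerprint(current_text), current_text, new_text)


def render_patch(overlay: Overlay) -> str:
    """Render the overlay as a unified diff."""
    return "".join(
        difflib.unified_diff(
            overlay.base_text.splitlines(keepends=True),
            overlay.new_text.splitlines(keepends=True),
            fromfile=f"a/{overlay.page}",
            tofile=f"b/{overlay.page}",
        )
    )


def apply_overlay(overlay: Overlay, store: dict[str, str], writable: bool) -> str:
    """Apply under drift checks; never overwrite content that changed underneath."""
    if not writable:
        return "keep-draft"  # read-only target: the edit stays a local draft
    if fingerprint(store[overlay.page]) != overlay.base_fingerprint:
        return "refuse"  # external drift since the draft was taken
    store[overlay.page] = overlay.new_text
    return "fast-forward"
```

In this sketch a drifted source is simply refused; a three-way merge, which the entry lists as future work, would slot in at the `refuse` branch.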
104
registry/capabilities/capability.wiki.page-model.md
Normal file
@@ -0,0 +1,104 @@
|
||||
---
|
||||
id: capability.wiki.page-model
|
||||
name: Backend-Neutral Wiki Page Model
|
||||
summary: A Markdown-first but stretchable page model with stable identity separate from placement and layered provenance, spanning prose to typed-graph and computational shapes.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, page-model, identity, provenance, markdown, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D5
|
||||
target: D6
|
||||
confidence: high
|
||||
rationale: >
|
||||
Page shapes (prose, typed records, typed-graph, inline-embedded, non-Markdown,
|
||||
and four computational shapes) plus identity != placement and layered provenance
|
||||
are specified and grounded in the dive research.
|
||||
availability:
|
||||
current: A2
|
||||
target: A5
|
||||
confidence: medium
|
||||
rationale: >
|
||||
Identity/Placement/Span/Page and layered ProvenanceEnvelope exist as a source
|
||||
module; richer shapes (typed-graph, notebook) are modeled but not all built.
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C2
|
||||
name: Partial
|
||||
confidence: medium
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations:
|
||||
- stable identity distinct from placement and from content fingerprint
|
||||
- layered (effective-vs-own) provenance with near-zero per-span cost
|
||||
broken_expectations:
|
||||
- non-prose shapes (typed-graph, notebook, inline-embedded) not fully realized
|
||||
out_of_scope_expectations:
|
||||
- rendering / presentation
|
||||
reliability:
|
||||
level: R1
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- prose shape is the only exercised path so far
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
One backend-neutral lingua franca every consumer sees; every shape reduces to
|
||||
(content|source, structure, provenance envelope, optional derivation rule).
|
||||
includes:
|
||||
- page identity (stable handle) vs placement (N paths/shards) vs equivalence (fingerprint)
|
||||
- layered provenance envelope (page + span deltas)
|
||||
- page-shape taxonomy incl. computational shapes
|
||||
excludes:
|
||||
- deriving identity from content (a fingerprint identifies a version, not a page)
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-34, UC-39, UC-44..UC-49, UC-55, UC-73, UC-83, UC-84"
|
||||
|
||||
availability:
|
||||
current_level: A2
|
||||
target_level: A5
|
||||
current_artifacts:
|
||||
- "shard-wiki/src/shard_wiki/model/"
|
||||
- "shard-wiki/src/shard_wiki/provenance/"
|
||||
consumption_modes:
|
||||
- source module
|
||||
|
||||
relations:
|
||||
supports:
|
||||
- capability.wiki.adapter-contract
|
||||
- capability.wiki.shard-orchestration
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/spec/CoreArchitectureBlueprint.md (Section 7)"
|
||||
- "shard-wiki/spec/FederationRequirements.md (ADR-02, ADR-04)"
|
||||
tests:
|
||||
- "shard-wiki/tests/test_model.py"
|
||||
- "shard-wiki/tests/test_provenance.py"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- a portable, provenance-carrying representation of wiki pages across backends
|
||||
not_recommended_for:
|
||||
- cases needing a single canonical path per page (use identity, not path)
|
||||
known_limitations:
|
||||
- non-prose shapes specified ahead of implementation
|
||||
---
|
||||
|
||||
# Backend-Neutral Wiki Page Model
|
||||
|
||||
The top narrow waist: a Markdown-first model that stretches to typed records, typed-graph
|
||||
statements, inline-embedded objects, non-Markdown assets, and computational shapes. Identity
|
||||
is a stable handle; placement and equivalence are separate mechanisms; provenance is layered
|
||||
(effective = page envelope + span delta).
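The separation described above can be illustrated with a minimal sketch. Every name here (`PageIdentity`, `Placement`, `Span`, `Page`, `fingerprint`, `effective_provenance`) is hypothetical and chosen for illustration; none is taken from `shard_wiki/model/` or `shard_wiki/provenance/`. It assumes provenance can be modeled as a flat mapping, which the real envelope may not be.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PageIdentity:
    """Stable handle for a page; never derived from content or path."""
    handle: str


@dataclass(frozen=True)
class Placement:
    """One of N places (shard + path) where the same page appears."""
    shard: str
    path: str


def fingerprint(content: str) -> str:
    """Equivalence mechanism: identifies a version of content, not a page."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


@dataclass
class Span:
    text: str
    # Only the delta against the page envelope is stored per span,
    # so an unannotated span costs an empty mapping.
    own_provenance: dict = field(default_factory=dict)


@dataclass
class Page:
    identity: PageIdentity
    placements: list
    envelope: dict  # page-level provenance (assumed flat for this sketch)
    spans: list


def effective_provenance(page: Page, span: Span) -> dict:
    """effective = page envelope + span delta (span keys override)."""
    return {**page.envelope, **span.own_provenance}
```

In this sketch a page can gain or lose placements, and its content fingerprint can change with every edit, while `identity.handle` stays fixed.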
|
||||
|
||||
## Assessment notes
|
||||
|
||||
### Discovery
|
||||
Specified in CoreArchitectureBlueprint Section 7 and FederationRequirements ADR-02/04.
|
||||
|
||||
### Availability
|
||||
`model/` + `provenance/` ship the prose path and the layered envelope today.
|
||||
114
registry/capabilities/capability.wiki.shard-orchestration.md
Normal file
@@ -0,0 +1,114 @@
|
||||
---
|
||||
id: capability.wiki.shard-orchestration
|
||||
name: Wiki Shard Orchestration
|
||||
summary: Present a union of pages across heterogeneous wiki-shaped shards while preserving each shard's provenance, capabilities, and history.
|
||||
owner: shard-wiki
|
||||
status: draft
|
||||
domain: helix_forge
|
||||
tags: [wiki, federation, orchestration, union, shard-wiki]
|
||||
|
||||
maturity:
|
||||
discovery:
|
||||
current: D5
|
||||
target: D6
|
||||
confidence: high
|
||||
rationale: >
|
||||
Grounded in 84 documented use cases and a twice-reviewed whole-system
|
||||
architecture (CoreArchitectureBlueprint) derived from ~23 prior-art systems.
|
||||
availability:
|
||||
current: A2
|
||||
target: A5
|
||||
confidence: medium
|
||||
rationale: >
|
||||
InformationSpace orchestrator (attach -> resolve -> read, chorus on
|
||||
ambiguity) works as a Python source module; network API and incremental
|
||||
union are planned.
|
||||
|
||||
external_evidence:
|
||||
completeness:
|
||||
level: C2
|
||||
name: Partial
|
||||
confidence: medium
|
||||
basis: scope_vs_intent_and_consumer_expectations
|
||||
satisfied_expectations:
|
||||
- attach folder shards and read a union page with layered provenance
|
||||
- chorus presentation of equivalent-but-divergent pages (union without erasure)
|
||||
broken_expectations:
|
||||
- incremental union maintenance and equivalence index not yet built
|
||||
- write-through federation transports not yet built
|
||||
out_of_scope_expectations:
|
||||
- hosting or replacing the underlying wiki engines
|
||||
reliability:
|
||||
level: R1
|
||||
confidence: low
|
||||
basis: consumer_quality_signals
|
||||
known_reliability_risks:
|
||||
- early implementation; 64 tests but no production exposure
|
||||
|
||||
discovery:
|
||||
intent: >
|
||||
Let independently stored, differently implemented wikis behave as one
|
||||
coherent, versionable, inspectable information space without homogenizing them.
|
||||
includes:
|
||||
- union resolution across shards (identity-keyed)
|
||||
- chorus / designated-canonical presentation of equivalent pages
|
||||
- lazy replication projection of remote content with freshness
|
||||
excludes:
|
||||
- implementing a backend wiki engine (see capability.wiki.engine-typed-extensions)
|
||||
- silent remote mutation
|
||||
assumptions:
|
||||
- canonical truth lives in shards + a git coordination journal; the union is derived
|
||||
use_cases:
|
||||
- "shard-wiki UseCaseCatalog UC-01..UC-07, UC-26..UC-33 (information space, federation, coordination)"
|
||||
|
||||
availability:
|
||||
current_level: A2
|
||||
target_level: A5
|
||||
current_artifacts:
|
||||
- "shard-wiki/src/shard_wiki/union/"
|
||||
- "shard-wiki/src/shard_wiki/space.py"
|
||||
target_artifacts:
|
||||
- orchestrator network API
|
||||
consumption_modes:
|
||||
- source module
|
||||
|
||||
relations:
|
||||
depends_on:
|
||||
- capability.wiki.adapter-contract
|
||||
- capability.wiki.page-model
|
||||
- capability.wiki.coordination-journal
|
||||
supports:
|
||||
- capability.wiki.federation-models
|
||||
|
||||
evidence:
|
||||
documentation:
|
||||
- "shard-wiki/spec/CoreArchitectureBlueprint.md"
|
||||
- "shard-wiki/spec/FederationArchitecture.md"
|
||||
tests:
|
||||
- "shard-wiki/tests/test_union.py"
|
||||
- "shard-wiki/tests/test_integration.py"
|
||||
|
||||
consumer_guidance:
|
||||
recommended_for:
|
||||
- composing multiple Markdown/wiki stores into one provenance-preserving view
|
||||
not_recommended_for:
|
||||
- replacing a single wiki engine
|
||||
known_limitations:
|
||||
- resolution is recompute-on-read until the incremental tier lands
|
||||
---
|
||||
|
||||
# Wiki Shard Orchestration
|
||||
|
||||
shard-wiki's core capability: orchestrate wiki-shaped content across heterogeneous *shards*
|
||||
as a union of pages, preserving provenance, capabilities, and history per shard. Canonical
|
||||
truth stays at the edges (shards + the git coordination journal); the union is a derived,
|
||||
recomputable view (orchestrator, not engine).
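The attach, resolve, read flow with chorus on ambiguity might look roughly like the toy below. `ToySpace` and its return shapes are assumptions made for this sketch; it is not the `InformationSpace` API, and it models a shard as a plain mapping from identity handle to content.

```python
class ToySpace:
    """Toy orchestrator: shards stay canonical, the union is recomputed on read."""

    def __init__(self):
        self._shards = {}  # shard name -> {identity handle: content}

    def attach(self, name, pages):
        """Attach a shard; its pages are copied here only to keep the toy simple."""
        self._shards[name] = dict(pages)

    def resolve(self, handle):
        """Return every (shard, content) candidate for an identity handle."""
        return [
            (name, pages[handle])
            for name, pages in self._shards.items()
            if handle in pages
        ]

    def read(self, handle):
        """Read a union page; divergent candidates are presented as a chorus."""
        candidates = self.resolve(handle)
        if not candidates:
            raise KeyError(handle)
        distinct = {content for _, content in candidates}
        if len(distinct) == 1:
            return {
                "kind": "single",
                "content": candidates[0][1],
                "sources": [shard for shard, _ in candidates],
            }
        # Union without erasure: keep every voice and its shard of origin.
        return {
            "kind": "chorus",
            "voices": [{"shard": s, "content": c} for s, c in candidates],
        }
```

Nothing is cached between calls, which mirrors the recompute-on-read limitation noted in the front matter; an incremental tier would replace the scan in `resolve`.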
|
||||
|
||||
## Assessment notes
|
||||
|
||||
### Discovery
|
||||
Grounded by `UseCaseCatalog.md` (84 UCs) and the hardened `CoreArchitectureBlueprint.md`.
|
||||
|
||||
### Availability
|
||||
`InformationSpace` provides attach/resolve/read today (source module); a network API is the
|
||||
target availability step.
|
||||
145
registry/indexes/capabilities.yaml
Normal file
@@ -0,0 +1,145 @@
|
||||
version: 1
|
||||
updated: '2026-06-16'
|
||||
domain: helix_forge
|
||||
capabilities:
|
||||
- id: capability.wiki.shard-orchestration
|
||||
name: Wiki Shard Orchestration
|
||||
summary: Present a union of pages across heterogeneous wiki-shaped shards while
|
||||
preserving each shard's provenance, capabilities, and history.
|
||||
vector: D5 / A2 / C2 / R1
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.shard-orchestration.md
|
||||
tags:
|
||||
- wiki
|
||||
- federation
|
||||
- orchestration
|
||||
- union
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- source module
|
||||
- id: capability.wiki.adapter-contract
|
||||
name: Capability-Aware Shard Adapter Contract
|
||||
summary: A versioned backend interface where each binding declares a verified capability
|
||||
profile, so federation ops degrade by capability.
|
||||
vector: D5 / A2 / C2 / R1
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.adapter-contract.md
|
||||
tags:
|
||||
- wiki
|
||||
- adapter
|
||||
- capability
|
||||
- contract
|
||||
- conformance
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- source module
|
||||
- id: capability.wiki.page-model
|
||||
name: Backend-Neutral Wiki Page Model
|
||||
summary: A Markdown-first but stretchable page model with stable identity separate
|
||||
from placement and layered provenance.
|
||||
vector: D5 / A2 / C2 / R1
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.page-model.md
|
||||
tags:
|
||||
- wiki
|
||||
- page-model
|
||||
- identity
|
||||
- provenance
|
||||
- markdown
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- source module
|
||||
- id: capability.wiki.coordination-journal
|
||||
name: Event-Sourced Coordination Journal
|
||||
summary: An append-only, totally-ordered-per-space decision log whose current state
|
||||
is a derived fold; git-addressable history.
|
||||
vector: D5 / A2 / C2 / R1
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.coordination-journal.md
|
||||
tags:
|
||||
- wiki
|
||||
- event-sourcing
|
||||
- coordination
|
||||
- git
|
||||
- journal
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- source module
|
||||
- id: capability.wiki.overlay
|
||||
name: Overlay-Before-Mutation Write Path
|
||||
summary: Non-destructive edits (draft -> patch -> apply-under-drift) that let read-only
|
||||
or limited backends be edited safely without silent remote mutation.
|
||||
vector: D5 / A2 / C2 / R1
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.overlay.md
|
||||
tags:
|
||||
- wiki
|
||||
- overlay
|
||||
- patch
|
||||
- write-path
|
||||
- conflict
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- source module
|
||||
- id: capability.wiki.federation-models
|
||||
name: Selectable Federation-Model Taxonomy
|
||||
summary: Federation as a plural, composable coordination axis (fork+journal, VCS-replication,
|
||||
query-join, feed, activity-streams, engine-mirror) selected per space.
|
||||
vector: D4 / A0 / C1 / R0
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.federation-models.md
|
||||
tags:
|
||||
- wiki
|
||||
- federation
|
||||
- taxonomy
|
||||
- composable
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- informational
|
||||
- id: capability.wiki.engine-typed-extensions
|
||||
name: Wiki Engine with Typed Extensions
|
||||
summary: A small-core wiki engine realizing a typed-extension framework that addresses
|
||||
all wiki use cases and lets each shard activate only the features it needs.
|
||||
vector: D3 / A0 / C0 / R0
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.engine-typed-extensions.md
|
||||
tags:
|
||||
- wiki
|
||||
- engine
|
||||
- typed-extensions
|
||||
- feature-activation
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- informational
|
||||
- id: capability.wiki.derived-views
|
||||
name: Wiki Derived Views
|
||||
summary: Recomputable views over a wiki union — BackLinks, RecentChanges, AllPages,
|
||||
SiteMap, and (delegate-or-derive) Search — carrying provenance.
|
||||
vector: D3 / A0 / C0 / R0
|
||||
domain: helix_forge
|
||||
status: draft
|
||||
owner: shard-wiki
|
||||
path: registry/capabilities/capability.wiki.derived-views.md
|
||||
tags:
|
||||
- wiki
|
||||
- derived-views
|
||||
- backlinks
|
||||
- recentchanges
|
||||
- search
|
||||
- shard-wiki
|
||||
consumption_modes:
|
||||
- informational
|
||||
38
research/260614-computational-page-model-synthesis/README.md
Normal file
@@ -0,0 +1,38 @@
|
||||
# 260614 — The computational page model (SHARD-WP-0004 synthesis)
|
||||
|
||||
Date: 2026-06-15 · Source: **SHARD-WP-0004** post-batch synthesis (T1–T8)
|
||||
|
||||
## What this is
|
||||
|
||||
The acceptance-criteria synthesis for SHARD-WP-0004 — *"the computational page model"* —
|
||||
reading the eight computational/interactive-knowledge dives across each other and distilling
|
||||
them into **one model**: **source is canonical; everything rendered/computed is a
|
||||
projection**, placed on **two axes** (projection-kind: replication vs derivation; liveness:
|
||||
live↔snapshot), with a recommendation on an executable-content capability.
|
||||
|
||||
## The answer to the carried question
|
||||
|
||||
*Can a shard-wiki page be a live computational artifact?* **Yes — as a page-model +
|
||||
projection concern, not as an execution platform.** Every system externalizes to a canonical
|
||||
source and treats the live/computed form as derived; shard-wiki **recognizes** computational
|
||||
content, **attaches the source**, and **presents derivations as provenance- and
|
||||
liveness-marked projections**, with **execution as a gated capability (off by default,
|
||||
degrade to snapshot)**. No INTENT amendment required.
|
||||
|
||||
## Key contributions
|
||||
|
||||
- **One model:** `(source, derivation rule, projection with provenance + liveness)` covers
|
||||
all four computational page shapes (one-source-many-projections UC-83; notebook UC-84;
|
||||
program-as-page; live/temporal content).
|
||||
- **Two axes for T16:** replication vs **derivation-projection** (timing / multiplicity /
|
||||
continuity facets) × the **live↔snapshot** axis (bounded at the irreducibly-live far end by
|
||||
Strudel).
|
||||
- **One snapshot-provenance record** reused for notebook outputs, renders, recordings.
|
||||
- **Hard boundaries:** never host a kernel/runtime as store; **image-is-not-a-store**; never
|
||||
present a derivation without output→source provenance.
|
||||
|
||||
## Contents
|
||||
|
||||
| Path | Role |
|------|------|
| `findings.md` | The source/derivation/projection model, the two axes, the four page shapes, provenance/reproducibility, the recommendation, the SHARD-WP-0002 fold-in, escalated open questions |
|
||||
178
research/260614-computational-page-model-synthesis/findings.md
Normal file
@@ -0,0 +1,178 @@
|
||||
# The computational page model — synthesis (SHARD-WP-0004)
|
||||
|
||||
**Date:** 2026-06-15 · **Source:** SHARD-WP-0004 (post-batch synthesis, T1–T8) · **Kind:**
|
||||
synthesis (no new external research) reading the eight computational/interactive-knowledge
|
||||
dives *across* each other.
|
||||
|
||||
## What this is
|
||||
|
||||
The acceptance-criteria synthesis for SHARD-WP-0004. The batch asked one carried question:
|
||||
*can a shard-wiki page be a live computational artifact — a source woven/evaluated into
|
||||
rendered forms — and if so, how do projection, transclusion, provenance, and the adapter
|
||||
contract treat the source, the environment, and the computed output?* This memo answers it
|
||||
by distilling the eight dives into **one model**: the **source / derivation / projection**
|
||||
view of computational content, anchored on two axes, plus a recommendation on whether
|
||||
shard-wiki needs an executable-content capability.
|
||||
|
||||
Dives consolidated: **T1** literate programming (WEB/weave/tangle), **T2** Mathematica,
|
||||
**T3** Jupyter, **T4** Processing/p5.js, **T5** Strudel, **T6** Squeak, **T7** Glamorous
|
||||
Toolkit, **T8** Pharo. Catalog yield: **UC-83** (literate one-source-many-projections),
|
||||
**UC-84** (notebook with computed-output provenance); the other six are **enrichment/boundary**
|
||||
dives (UC-54/55/47/48 + the projection model).
|
||||
|
||||
## 1. The one finding: source is canonical, everything rendered is a projection
|
||||
|
||||
Every system in the batch, however "live," **externalizes to a canonical artifact** and
|
||||
treats the rendered/computed form as **derived**:
|
||||
|
||||
| System | Canonical (attach this) | Derived (a projection) |
|--------|-------------------------|------------------------|
| Literate (WEB) | the WEB/`.nw`/`.org` **source** | woven docs **and** tangled code |
| Mathematica | the `.nb` (a Wolfram expression) | cached `Output` cells; CDF; `Dynamic` widgets |
| Jupyter | `.ipynb` source (ideally paired text) | embedded outputs; nbconvert/nbviewer renders |
| Processing | the **sketch source** | the view-time canvas render |
| Strudel | the **pattern source** | the live audio performance / a recording |
| GT / Lepiter | git-versionable **page/Tonel files** | moldable `gtView`s; live snippet results |
| Squeak/Pharo | exported **files** (Tonel/git) | the live image / running objects |
|
||||
|
||||
**This is the same principle shard-wiki already holds** (files-canonical, index/render
|
||||
derived — ikiwiki UC-79, Logseq UC-62, nbstripout). The computational batch did not break it;
|
||||
it **stress-tested it to the live extreme and it held**. *The image (Squeak/Pharo) is the
|
||||
only would-be exception, and it is a boundary, not a counterexample: an image is not a store.*
|
||||
|
||||
## 2. Two axes the contract must add
|
||||
|
||||
The batch refines the **projection model** (SHARD-WP-0002 T16) with two orthogonal axes:
|
||||
|
||||
### 2.1 Projection kind: replication vs derivation
|
||||
- **Replication-projection** (the existing default): a lazy **cache/copy** of remote content
|
||||
(Obsidian/Notion mirrors, UC-53/57).
|
||||
- **Derivation-projection** (new, from T1): a **transform/compile/weave/evaluate** of a
|
||||
source into rendered forms — regenerable, may delegate to the source's tool, **degrades to
|
||||
a captured snapshot** when the tool is absent. Covers weave/tangle, nbconvert, CDF, sketch
|
||||
render, audio recording, `gtView`.
|
||||
|
||||
Sub-facets of derivation (from T4): **materialization timing** = *ahead-of-time* (CDF,
|
||||
nbconvert, static HTML) vs *view-time* (Processing/Strudel); **multiplicity** = one output
|
||||
(UC-79) vs **N co-equal projections** (UC-83 weave+tangle; UC-47/48/54 moldable views).
|
||||
|
||||
### 2.2 Liveness: the live↔snapshot axis (from T6, bounded by T5)
|
||||
Every derived view sits on a spectrum, and the more live the source, the more its static form
|
||||
is a clearly-marked degrading snapshot:
|
||||
|
||||
```
static source ── captured output ── live-over-files ── view-time one-shot ── continuous/
(literate)       (notebook UC-84)   (GT/Lepiter)       (Processing)          interactive
                                                                             ── irreducibly
                                                                                live/temporal
                                                                                (Strudel: source
                                                                                 + recording only)
```
|
||||
|
||||
**Honesty rule (union-without-erasure):** a computed/live view must always declare *what it
|
||||
is* — "captured at run N, environment unguaranteed" (UC-84), "one performance, time T, source
|
||||
rev R" (UC-83/Strudel), "live render needs the runtime." Never present a snapshot as live or a
|
||||
static page as capturing a live artifact.
|
||||
|
||||
## 3. The computational page-model shapes (T12)
|
||||
|
||||
The batch adds these page shapes (beyond prose / typed records / query-defined / inline-
|
||||
embedded objects / typed-graph already catalogued):
|
||||
|
||||
1. **One-source-many-projections** (UC-83) — a source whose presented forms are co-equal
|
||||
derivations (docs + code), each with output→source provenance; **named-chunk transclusion**
|
||||
assembles fragments by name at derivation time (UC-32/44).
|
||||
2. **Notebook** (UC-84) — **ordered/nestable cells** (Mathematica adds the outline tree)
|
||||
where code cells own **embedded computed outputs** (the derived output is stored *inside*
|
||||
the source) with **weak execution provenance**; outputs may be MIME blobs or **structured
|
||||
re-evaluable values** (Mathematica) — a new point on the content-opacity spectrum.
|
||||
3. **Program-as-page** (Processing) — canonical content = **source text**, presentation = an
|
||||
**executable render** with **no cached output**; non-Markdown executable content.
|
||||
4. **Live/temporal/generative content** (Strudel) — source canonical, render irreducibly
|
||||
live; static = source + a marked recording.
|
||||
|
||||
All four reduce to **(source, derivation rule, projection with provenance + liveness)** — one
|
||||
model, four positions.
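As a rough illustration of "one model, four positions", the sketch below places each shape on the two axes from §2. The type names, the numeric ordering of `Liveness`, and the example file names and revision strings are all assumptions for this sketch, not part of any shard-wiki module.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class ProjectionKind(Enum):
    REPLICATION = "replication"  # lazy cache/copy of remote content
    DERIVATION = "derivation"    # transform/compile/weave/evaluate of a source


class Liveness(IntEnum):
    """Positions on the live-to-snapshot axis, ordered from static to live."""
    STATIC_SOURCE = 0
    CAPTURED_OUTPUT = 1
    LIVE_OVER_FILES = 2
    VIEW_TIME_ONE_SHOT = 3
    CONTINUOUS_INTERACTIVE = 4
    IRREDUCIBLY_LIVE = 5


@dataclass(frozen=True)
class Projection:
    source: str           # the canonical artifact that gets attached
    derivation_rule: str  # how the presented form is produced
    kind: ProjectionKind
    liveness: Liveness
    provenance: str       # output -> source link (here: a source revision)


D = ProjectionKind.DERIVATION

# Hypothetical instances of the four computational shapes.
SHAPES = {
    "one-source-many-projections": [
        Projection("doc.nw", "weave", D, Liveness.STATIC_SOURCE, "rev:abc"),
        Projection("doc.nw", "tangle", D, Liveness.STATIC_SOURCE, "rev:abc"),
    ],
    "notebook": [
        Projection("analysis.ipynb", "execute-cells", D, Liveness.CAPTURED_OUTPUT, "rev:def"),
    ],
    "program-as-page": [
        Projection("sketch.pde", "render", D, Liveness.VIEW_TIME_ONE_SHOT, "rev:123"),
    ],
    "live-temporal": [
        Projection("pattern.txt", "evaluate", D, Liveness.IRREDUCIBLY_LIVE, "rev:456"),
    ],
}


def most_live(shape: str) -> Liveness:
    """Return the far-end liveness position among a shape's projections."""
    return max(p.liveness for p in SHAPES[shape])
```

Only the field values differ between shapes; the record type is the same, which is the sense in which the four shapes reduce to one model.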
|
||||
|
||||
## 4. Provenance & reproducibility (the honest weakness)
|
||||
|
||||
Computed output provenance is **real but fragile** everywhere: Jupyter/Mathematica
|
||||
`execution_count`/`In`-`Out` can be **out-of-order**; **environment/versions/data are not
|
||||
captured**; Strudel/Processing may be **non-deterministic**. Implication for the contract:
|
||||
treat a computed output as a **snapshot with declared, incomplete provenance** (run id, source
|
||||
rev, timestamp; environment "unguaranteed"), reusing **one snapshot-provenance machinery**
|
||||
across notebooks, recordings, and renders (UC-84 is the template). This is consistent with
|
||||
shard-wiki's existing "surface freshness/completeness, never imply more than you have"
|
||||
(Oddmuse partial-history UC-82).
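A single snapshot-provenance record along these lines could be sketched as follows. The class name, field names, and label wording are assumptions for illustration; only the listed contents (run id, source rev, timestamp, environment "unguaranteed") come from this memo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotProvenance:
    """Declared, incomplete provenance for any computed or captured output."""
    run_id: str
    source_rev: str
    timestamp: str                     # e.g. an ISO 8601 string
    environment: str = "unguaranteed"  # versions/data are not captured

    def label(self) -> str:
        """Human-readable declaration of what the snapshot is."""
        return (
            f"captured at run {self.run_id}, source rev {self.source_rev}, "
            f"{self.timestamp}, environment {self.environment}"
        )


# The same record type is reused for different kinds of output.
notebook_output = SnapshotProvenance("42", "abc123", "2026-06-15T10:00:00Z")
recording = SnapshotProvenance("perf-1", "def456", "2026-06-15T11:30:00Z")
```

Because `environment` defaults to `"unguaranteed"`, a caller has to opt in explicitly to claim anything stronger, which matches the "never imply more than you have" posture.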
|
||||
|
||||
## 5. The recommendation
|
||||
|
||||
**Does shard-wiki need an executable/computational content capability? — Yes, but only as
|
||||
recognition + projection + capability-gating, never as an execution engine.**
|
||||
|
||||
1. **Adopt the source/derivation/projection model** (no execution required). shard-wiki
|
||||
**recognizes** computational content types (literate source, notebook, sketch, pattern),
|
||||
**attaches the canonical source**, and **presents derived forms as projections with
|
||||
provenance + liveness markers**. This alone delivers UC-83/UC-84 and the enrichment of
|
||||
UC-54/55 — and needs **no kernel, no sandbox, no runtime**.
|
||||
2. **Make execution a capability, off by default.** "Drive a derivation" (run tangle/weave,
|
||||
re-execute a notebook, render a sketch, evaluate a pattern in the viewer) is a **gated
|
||||
capability** with a **trust/sandbox** sub-concern (T11). Absent it, **degrade to the
|
||||
captured snapshot / static render / recording** — the graceful-degradation rule, which the
|
||||
batch shows always has an honest fallback (source is tiny and diffable everywhere).
|
||||
3. **One projection model, two axes** (T16): projection-kind (replication vs derivation; with
|
||||
timing + multiplicity facets) × liveness (live↔snapshot). The **moldable view registry**
|
||||
(T7) is the unifying structure — an open, type-keyed set of co-equal projections, none
|
||||
canonical-by-fact (display-canonical is policy).
|
||||
4. **One snapshot-provenance record** reused for notebook outputs, renders, and recordings
|
||||
(run id, source rev, timestamp, environment "unguaranteed").
|
||||
5. **Hard boundaries** (design-bugs if violated): never host a kernel/runtime as the store;
|
||||
**image-is-not-a-store** (attach exported files); never present a derivation without
|
||||
output→source provenance; never imply a static view captures a live artifact.
|
||||
|
||||
**Net:** computational content is **in scope as a page-model + projection concern**, **out of
|
||||
scope as an execution platform** — exactly the mechanism-over-policy, capability-aware,
|
||||
degradable posture INTENT already mandates. No INTENT amendment is required; this extends the
|
||||
page model and projection model within existing constraints.
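The "execution as a gated capability, off by default, degrade to snapshot" rule can be sketched as a small presentation function. The function name, its parameters, and the returned dictionary shape are hypothetical; the sketch deliberately leaves sandboxing and trust policy out of scope.

```python
def present(source, snapshot=None, can_execute=False, runner=None):
    """Choose what to show for computational content.

    source      -- canonical source text (always available once attached)
    snapshot    -- previously captured output, or None
    can_execute -- the gated capability; off by default
    runner      -- callable that drives the derivation when allowed
    """
    if can_execute and runner is not None:
        # Capability granted: drive the derivation and mark the result live.
        return {"liveness": "live", "output": runner(source)}
    if snapshot is not None:
        # Default path: degrade to the captured snapshot, marked as such.
        return {"liveness": "snapshot", "output": snapshot}
    # Honest last resort: show the source itself, never a fake render.
    return {"liveness": "source-only", "output": source}
```

Every branch labels its result, so a snapshot is never presented as live; passing a `runner` without setting `can_execute` still yields the snapshot.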
|
||||
|
||||
## 6. Fold into SHARD-WP-0002
|
||||
|
||||
- **T12 (page model):** add the four computational shapes (§3); allow nestable cells and
|
||||
structured re-evaluable outputs; "derived output may live inside the source" (notebook).
|
||||
- **T16 (projection):** the **two-axis model** (§2) + the **moldable view registry** (§3/T7);
|
||||
materialization-timing, multiplicity, continuity, and the live↔snapshot far end as explicit
|
||||
projection metadata.
|
||||
- **T11 (capabilities):** "derive/execute/render/evaluate" as gated capabilities with trust/
|
||||
sandbox; default off → snapshot.
|
||||
- **T15 (fidelity):** non-Markdown executable/computed content; lossy renders; the
|
||||
structured-re-evaluable-value point on the content-opacity spectrum.
|
||||
- **T13 (history):** paired-text (Jupytext) / cell-aware (nbdime) strategies for embedded-
|
||||
output documents; outputs-as-derived (nbstripout ethos).
|
||||
- **T14 (binding):** **image-is-not-a-store** boundary (export→files only).
|
||||
|
||||
## 7. Open questions (escalated)
|
||||
|
||||
1. Is **liveness** (and "irreducibly live / no faithful static form") an explicit first-class
|
||||
metadata flag on every projection, so the union renders the honest fallback automatically?
|
||||
(T5/T6 far-end question.)
|
||||
2. Does shard-wiki **ever drive a derivation** (sandboxed), or strictly attach + present
|
||||
snapshots? (Recurs UC-56/UC-83/UC-84/T4/T5 — a single capability/trust policy decision.)
|
||||
3. Is a computed output's **structured re-evaluable value** (Mathematica/Wolfram) modeled as a
|
||||
typed value or stored opaquely with provenance? (UC-55 open-Q #10; UC-84 Q3.)
|
||||
4. Should **UC-83** and **UC-84** eventually merge as two positions of one "source +
|
||||
derivations" shape, or stay distinct? (Kept distinct: UC-84's defining trait is *output
|
||||
embedded in source with weak provenance*; UC-83's is *N co-equal external derivations*.)
|
||||
|
||||
## 8. Sources
|
||||
|
||||
The eight SHARD-WP-0004 dives: `research/260614-{literate-programming,mathematica,jupyter,
|
||||
processing,strudel,squeak-pharo,glamorous-toolkit}-deep-dive/`. Prior projection/structure
|
||||
anchors: `research/260614-{ikiwiki,logseq,zigzag}-deep-dive/`,
|
||||
`research/260614-shard-spectrum-synthesis/`.
|
||||
|
||||
## 9. Traceability
|
||||
|
||||
No new UC (consolidation). Consolidates **UC-83, UC-84** and enrichments to **UC-32, UC-44,
|
||||
UC-47, UC-48, UC-54, UC-55, UC-79, UC-37, UC-35**. Feeds **SHARD-WP-0002 T11/T12/T13/T14/T15/
|
||||
T16** (see §6). Recommendation: computational content is **in scope as page-model + projection
|
||||
mechanism, out of scope as an execution platform**; no INTENT amendment required.
|
||||
20
research/260614-federated-wiki-deep-dive/README.md
Normal file
@@ -0,0 +1,20 @@
|
||||
# 260614 — Federated Wiki deep dive
|
||||
|
||||
Deep dive on Ward Cunningham's **Federated Wiki** (Smallest Federated Wiki / SFW,
|
||||
2011 →) as a **federation model** rather than a single shard: fork-with-provenance,
|
||||
the per-page **JSON journal** of semantic actions, the **story** of typed items, the
|
||||
**neighborhood/roster** discovery model, and time-bounded **happenings**.
|
||||
|
||||
This is prior art for shard-wiki's **coordination layer itself** — the closest existing
|
||||
system to "a union of pages preserving provenance, assembled non-destructively." It
|
||||
extends `research/260608-federation-concepts/` §3 with the concrete data model + protocol.
|
||||
|
||||
- `findings.md` — full writeup: data model, journal/action types, federation protocol,
|
||||
capability profile, INTENT mapping, UC seeds (UC-70–UC-72), architecture notes for
|
||||
SHARD-WP-0002, open questions, sources, traceability.
|
||||
|
||||
Catalog yield: UC-70 (attach a fedwiki site via page-JSON + CORS), UC-71 (append-only
|
||||
semantic action journal with site provenance as a coordination-journal model), UC-72
|
||||
(fork-with-site-provenance federation across a neighborhood / chorus). Enriched
|
||||
UC-26/28/30/05/27. Feeds SHARD-WP-0002 T1–T5 (federation) and T11/T13/T16 (write
|
||||
granularity, log-based merge, identity≠placement).
|
||||
240
research/260614-federated-wiki-deep-dive/findings.md
Normal file
@@ -0,0 +1,240 @@
|
||||
# Federated Wiki — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T1 · **Subject:** Ward Cunningham's
|
||||
Smallest Federated Wiki (SFW) / Federated Wiki (fedwiki ecosystem).
|
||||
|
||||
## Why this dive
|
||||
|
||||
Every prior dive has been a *shard candidate* — a store we might attach. Federated Wiki
|
||||
is different: it is a **federation model**, the one piece of public prior art whose core
|
||||
job is the same as shard-wiki's coordination layer — *present a union of pages from many
|
||||
independent sites while preserving where each came from, and let people copy and edit
|
||||
non-destructively*. Ward Cunningham (inventor of the wiki) built SFW in 2011 precisely to
|
||||
fix the original wiki's single-canonical-page weakness with **fork + provenance**. We go
|
||||
past the surface (`260608-federation-concepts/` §3) into the data model and protocol, then
|
||||
ask what shard-wiki should adopt.
|
||||
|
||||
**Framing:** fedwiki is not just "a shard we attach" — it is a *worked example of the
|
||||
coordination journal, overlay-before-mutation, and union-without-erasure*, three of our
|
||||
own design pillars, shipped and running.
|
||||
|
||||
---
|
||||
|
||||
## 1. The data model — page = title + story + journal
|
||||
|
||||
A fedwiki page is a small JSON object with three core fields (plus optional decoration):
|
||||
|
||||
```json
{
  "title": "Welcome Visitors",
  "story": [
    { "type": "paragraph", "id": "7b56f22a4b9ee974",
      "text": "Welcome to this [[Federated Wiki]] site." },
    { "type": "image", "id": "a1c0e3...", "url": "...", "caption": "..." }
  ],
  "journal": [
    { "type": "create", "id": "7b56f22a4b9ee974", "item": {...}, "date": 1310000000000 },
    { "type": "add", "id": "a1c0e3...", "item": {...}, "after": "7b56f22a4b9ee974",
      "date": 1310000100000 },
    { "type": "edit", "id": "7b56f22a4b9ee974", "item": {...}, "date": 1310000200000 },
    { "type": "fork", "site": "ward.fed.wiki.org", "date": 1310000300000 }
  ]
}
```
|
||||
|
||||
- **story** — an *ordered array of typed items* ("paragraph-like" items). Each item is
|
||||
`{ type, id, text, ...type-specific }`. The **`id`** is a random 16-hex string,
|
||||
**stable across edits** (it is the unit of identity within a page). The **`type`** names
|
||||
the **plugin** that renders/edits the item (`paragraph`, `image`, `html`, `markdown`,
|
||||
`code`, `method`, `pagefold`, chart plugins, …). *Data lives in the item; behavior lives
|
||||
in the plugin* — the item is portable JSON; the plugin is the renderer.
|
||||
- **journal** — an *ordered, append-only array of action objects* that, when replayed,
|
||||
**reconstructs the story**. The story is a materialized view of the journal. This is the
|
||||
key architectural choice: **the journal is the source of truth, the story is derived.**
|
||||
|
||||
## 2. Journal action types — a semantic op-log
|
||||
|
||||
Each journal entry is an action with `{ type, ... , date }` (epoch-ms). The action types:
|
||||
|
||||
| action | fields | meaning |
|---------|--------|---------|
| `create`| `id, item, date` | first item — page born |
| `add` | `id, item, after, date` | insert an item after another |
| `edit` | `id, item, date` | replace an item's content (id preserved) |
| `move` | `order, date` | reorder items |
| `remove`| `id, date` | delete an item |
| `fork` | `site, date` | **mark that the page was copied from `site` at this point** |
|
||||
|
||||
Two things matter for us:
|
||||
|
||||
1. **These are *semantic* operations** (add/move/edit/remove a paragraph), not text diffs
|
||||
and not character-level CRDT ops. The write granularity is the **story item
|
||||
(paragraph)** — a *middle* granularity between whole-file (TiddlyWiki) and
|
||||
block/character (Logseq/CRDT). It is an **op-log** like a CRDT, but the ops are
|
||||
coarse-grained and **applied by humans via fork**, not auto-merged.
|
||||
2. **`fork` is the provenance primitive.** When you copy a remote page to your own site,
|
||||
a `fork` entry is appended recording the **source site** and time. The journal of a
|
||||
forked page therefore **serializes a directed acyclic graph (DAG)** of where content
|
||||
came from — "the journal of a forked page is detailed enough to recognize where in the
|
||||
journal of the original the fork took place" (CouchDB-style per-entry sequence numbers
|
||||
make the cut-point identifiable). History visualization highlights the forked entry.
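
As a rough illustration of locating the cut-point, the sketch below compares two journals entry by entry and reports how many leading entries they share, then lists the `fork` entries as provenance edges. It assumes a forked copy keeps the original's earlier entries verbatim and compares entries by plain equality. The helper names `fork_point` and `provenance_edges` are hypothetical.

```python
def fork_point(original, forked):
    """Return how many leading journal entries the two journals share."""
    shared = 0
    for ours, theirs in zip(original, forked):
        if ours != theirs:
            break
        shared += 1
    return shared


def provenance_edges(journal):
    """List (site, date) for every `fork` entry, in journal order."""
    return [
        (action["site"], action["date"])
        for action in journal
        if action["type"] == "fork" and "site" in action
    ]
```

The shared-prefix length is the index in the original journal where the copy diverged. Anything after it in either journal is local history.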
|
||||
|
||||
## 3. The federation protocol — sites, neighborhood, roster
|
||||
|
||||
- **Site** = an independent server (originally Ruby/Sinatra, now chiefly Node.js; also static-file and serverless
|
||||
variants). A site owns a set of pages, each served as **page JSON over HTTP** at
|
||||
`/<slug>.json`, with **CORS headers** so a *browser-side* client can fetch pages from
|
||||
**any** site. Page identity within a site is the **slug** (a title-derived kebab name).
|
||||
- **The client assembles the union, not the server.** The fedwiki client renders pages
**side by side** in a row (the "lineup"): clicking a link opens that page *from whatever site it
|
||||
resolves against*, appended to the right. Browsing literally builds a left-to-right
|
||||
trail across sites.
|
||||
- **Neighborhood** = the dynamic set of sites encountered in the current session (from the
|
||||
sites of pages you've opened, links, and forks). **Search runs across the neighborhood**
|
||||
— a federated search over exactly the sites you've touched.
|
||||
- **Roster** = an explicit, authored list of sites to include (a curated neighborhood);
|
||||
"sister sites" are peers you watch. There is **no central registry** — discovery is by
|
||||
link, fork, and roster.
|
||||
- **Happenings** = time-bounded collaborative events where many participants fork around a
|
||||
topic for a period, producing a burst of related forks (a bounded collaboration that
|
||||
leaves a durable forked record on each participant's own site).
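
The addressing and search ideas above can be sketched offline. The slug rule here (whitespace to hyphens, other punctuation dropped, lowercase) and the `http://` scheme are assumptions for illustration, and the neighborhood is modeled as an in-memory mapping rather than live HTTP fetches. All function names are hypothetical.

```python
import re


def as_slug(title):
    """Derive a kebab-style slug from a page title (assumed rule)."""
    hyphenated = re.sub(r"\s", "-", title)
    return re.sub(r"[^A-Za-z0-9-]", "", hyphenated).lower()


def page_url(site, title):
    """Build the page JSON address `/<slug>.json` on a site (scheme assumed)."""
    return f"http://{site}/{as_slug(title)}.json"


def search_neighborhood(neighborhood, term):
    """Search item text across every touched site.

    `neighborhood` maps site -> {slug: page JSON}; returns sorted (site, slug) hits.
    """
    needle = term.lower()
    hits = []
    for site, pages in neighborhood.items():
        for slug, page in pages.items():
            texts = (item.get("text", "") for item in page.get("story", []))
            if any(needle in text.lower() for text in texts):
                hits.append((site, slug))
    return sorted(hits)
```

Each hit keeps its site, so the result set preserves which sovereign source every match came from.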
|
||||
|
||||
## 4. The editorial model — fork, don't edit-in-place
|
||||
|
||||
You can only write to **your own** site. To change someone else's page you **fork** it
|
||||
(copy into your site, journal records the source), then edit your copy. Many forks of the
|
||||
same page coexist across sites — Cunningham's **"chorus of voices"**: *no canonical
|
||||
version*, divergence is normal and visible, and you choose whose changes to pull by forking
|
||||
them. There is **no automatic merge** — reconciliation is human: compare journals, fork the
|
||||
version you prefer, optionally re-fork upstream changes.
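
A minimal sketch of fork-then-edit, assuming pages are plain dicts with `story` and `journal` keys as in the page JSON above. `fork_page` and `edit_item` are hypothetical helper names, not fedwiki APIs.

```python
from copy import deepcopy


def fork_page(remote_page, source_site, now_ms):
    """Copy a remote page and record where it came from; the remote is untouched."""
    local = deepcopy(remote_page)
    local.setdefault("journal", []).append(
        {"type": "fork", "site": source_site, "date": now_ms}
    )
    return local


def edit_item(page, item_id, new_text, now_ms):
    """Edit one story item on your own copy and journal the change."""
    for item in page.get("story", []):
        if item["id"] == item_id:
            item["text"] = new_text
            page.setdefault("journal", []).append(
                {"type": "edit", "id": item_id, "item": dict(item), "date": now_ms}
            )
            return True
    return False
```

The deep copy is the point of the sketch: every write lands on the local copy, so the remote page cannot be mutated by construction.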
|
||||
|
||||
---
|
||||
|
||||
## 5. Capability profile
|
||||
|
||||
| Dimension (synthesis spectrum) | Federated Wiki |
|--------------------------------|----------------|
| Attachment mode | **REST/file-store hybrid** — page JSON over HTTP+CORS; also static files |
| Addressing granularity | **story item (paragraph)** via stable 16-hex `id` |
| Content identity | item `id` random+stable; page id = site + slug |
| Identity vs placement | **placement-bound**: identity = `site` + `slug`; forks are *new* identities linked by journal provenance |
| Structure | ordered array of **typed items** (plugin-typed) |
| History | **per-page append-only journal** of semantic actions (op-log) |
| Merge model | **fork + manual journal compare** — a *third model* beside git 3-way and CRDT auto-merge |
| Native query | none built-in; **neighborhood search** (federated full-text across touched sites) |
| Translation | item `text` is wiki/Markdown-ish; plugins own their formats |
| Attachment/write granularity | **story-item level** (add/edit/move/remove one item) |
| Operational envelope | tiny servers, browser-driven; CORS is the whole API surface |
| Access grant | **own-site-only writes**; reads open via CORS |
| Content opacity | transparent JSON (no E2EE); plugin-typed but inspectable |
| Provenance | **first-class** — `fork` records source site; journal = provenance DAG |
|
||||
|
||||
## 6. INTENT mapping
|
||||
|
||||
### Reinforcements (fedwiki validates our pillars)
|
||||
|
||||
- **Coordination journal** (INTENT) ≈ fedwiki **journal**. Our journal idea is *exactly*
|
||||
fedwiki's per-page append-only action log — and fedwiki proves the story-as-derived-view
|
||||
pattern works. Strong reinforcement; adopt the **semantic-op + provenance-entry** shape.
|
||||
- **Overlay before mutation** ≈ **fork**. Fork *is* the canonical overlay: a
|
||||
non-destructive copy onto a writable surface, recording provenance, before any change.
|
||||
- **Union without erasure** ≈ **neighborhood + chorus**. The union is assembled from many
|
||||
sovereign sites; provenance (which site, forked-from) is never hidden; divergence is
|
||||
surfaced, not resolved away.
|
||||
- **No silent remote mutation** ≈ **own-site-only writes**. You structurally *cannot*
|
||||
mutate a remote; you fork to your own site. This is our rule, enforced by architecture.
|
||||
- **Mechanism over policy** ≈ **no canonical source**. Fedwiki ships the mechanism (fork,
|
||||
journal, neighborhood) and leaves "which version wins" entirely to people.
|
||||
- **Graceful degradation** ≈ static-file sites — a fedwiki site can be a read-only pile of
|
||||
JSON files; still forkable, still in the neighborhood.
|
||||
|
||||
### Divergences (boundaries / design notes, not bugs)
|
||||
|
||||
- **Identity = placement.** Fedwiki page identity is `site` + `slug`; a fork is a *new*
|
||||
page whose only tie to the origin is a journal `fork` entry. shard-wiki wants
|
||||
**identity ≠ placement** (the "same" page across shards under a stable identity, T16) —
|
||||
so we treat fedwiki's journal-linked forks as *provenance edges*, and layer our own
|
||||
cross-shard identity over them rather than adopting slug-as-identity.
|
||||
- **No query / no typed-record model.** Fedwiki is paragraphs+plugins, not a typed DB
|
||||
(contrast Notion/Wikibase). Fine — it sits at the *coordination* end, not the structure
|
||||
end. We don't ask fedwiki to provide query; the neighborhood search is the model for
|
||||
*federated* search across shards (T-federation), not in-shard query.
|
||||
- **Browser-assembles-union.** Fedwiki pushes union assembly to the client. shard-wiki
|
||||
assembles server/orchestrator-side. Adopt the *model* (union from sovereign sources +
|
||||
provenance), not the client-only locus.
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **Journal = append-only semantic-op log with provenance entries**, story = derived
|
||||
replay view. This is the concrete shape for our coordination journal (T13).
|
||||
2. **Fork-with-source-attribution** as the overlay/adopt primitive across shards.
|
||||
3. **Neighborhood** as the model for a *dynamic, link-and-fork-discovered* federated set +
|
||||
search, with **roster** as the curated/explicit variant.
|
||||
4. **Chorus of forks** — represent divergent versions across shards as co-equal, linked by
|
||||
provenance, with reconciliation as an explicit human/policy step (mechanism over policy).
|
||||
|
||||
---
|
||||
|
||||
## 7. UC seeds
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-70 | Attach a Federated Wiki site as a shard via its **page JSON + CORS** (REST/file-store hybrid); project pages, fork to overlay | **new** |
| UC-71 | Adopt a **per-page append-only semantic-action journal with provenance entries** (fork=source site) as the coordination-journal model — replay to materialize, compare to locate divergence | **new** |
| UC-72 | **Fork-with-site-provenance federation across a neighborhood** of peer shards — assemble a union from links/forks, search across it, preserve the chorus without forcing a canonical | **new** |
| — | fork-with-provenance as overlay/adopt | enrich **UC-26** (fork) |
| — | carry-forward of forked content + upstream re-fork | enrich **UC-28** (carry-forward) |
| — | happenings = time-bounded collaboration leaving durable forks | enrich **UC-30** (time-bounded space) |
| — | union/chorus of co-equal versions, provenance-linked | enrich **UC-05 / UC-27** |
|
||||
|
||||
## 8. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T1–T5 (federation):** fedwiki is the reference design. The **journal** (append-only,
|
||||
semantic ops, fork-provenance) is the concrete coordination-journal shape; **neighborhood
|
||||
+ roster** is the discovery/membership model (dynamic vs curated); **fork** is the
|
||||
overlay/adopt op. Model the union as an assembly over sovereign sources with provenance
|
||||
edges, reconciliation left to policy.
|
||||
- **T11 (capability/write-granularity):** add **story-item / paragraph** as a named
|
||||
write-granularity tier between whole-file and block/character.
|
||||
- **T13 (history portability / merge model):** record fedwiki's **journal-replay op-log**
|
||||
as a *third merge model* beside git 3-way and CRDT auto-merge — a **coarse semantic
|
||||
op-log applied manually via fork**. A shard whose history *is* such a journal can supply
|
||||
our coordination journal almost directly (vs git-commit import or CRDT-update import).
|
||||
- **T16 (identity ≠ placement):** fedwiki's `fork` journal entries are **provenance edges**
|
||||
between same-named pages on different sites — exactly the cross-shard "same page,
|
||||
different placement" relation we must model. Use them as edges; keep our own identity
|
||||
layer above slug.
|
||||
|
||||
## 9. Open questions
|
||||
|
||||
1. Should shard-wiki's coordination journal adopt fedwiki's **exact action vocabulary**
|
||||
(create/add/edit/move/remove/fork) at the page-item level, or a more granular/abstract
|
||||
op set that other shards can also emit?
|
||||
2. Is **neighborhood** (dynamic, link/fork-discovered) a first-class membership mode for an
|
||||
information space, or only a *view* over an explicitly-configured shard set (roster)?
|
||||
3. How do we reconcile fedwiki's **slug-as-identity + fork-DAG** with our intended
|
||||
**stable cross-shard identity** (T16) — promote fork edges into the identity graph, or
|
||||
keep them as provenance-only annotations?
|
||||
4. Does the **chorus / no-canonical** stance compose with shards that *do* assert a
|
||||
canonical (Notion, an upstream git main)? (policy-selectable canonical over a
|
||||
mechanism that permits chorus.)
|
||||
|
||||
## 10. Sources
|
||||
|
||||
- Smallest Federated Wiki wiki: **Story JSON**, **Federation Details** —
|
||||
github.com/WardCunningham/Smallest-Federated-Wiki/wiki
|
||||
- JSON Schema notes — song.fed.wiki.org/json-schema.html
|
||||
- "Smallest Federated Wiki" — home.c2.com/smallest-federated-wiki.html
|
||||
- Federated Wiki — federated.wiki (Visualizing Page History)
|
||||
- Mike Caulfield, "The OER Case for Federated Wiki" — hapgood.us (2015)
|
||||
- Jon Udell, "A federated Wikipedia" — blog.jonudell.net (2015)
|
||||
- Wikipedia: *Federated Wiki*; IndieWeb: *Smallest Federated Wiki*
|
||||
- fedwiki/wiki-plugin-transport (plugin/transport reference)
|
||||
- prior: `research/260608-federation-concepts/` §3
|
||||
|
||||
## 11. Traceability
|
||||
|
||||
New UCs **UC-70–UC-72** carry the marker **⊞** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md` (true lineage = this dive; placed in the nearest existing column).
|
||||
Enriched: UC-26, UC-28, UC-30, UC-05, UC-27. Architecture cross-refs: SHARD-WP-0002
|
||||
T1–T5, T11, T13, T16.
|
||||
17
research/260614-forge-wikis-deep-dive/README.md
Normal file
@@ -0,0 +1,17 @@
|
||||
# 260614 — git-forge wikis deep dive (Gitea · GitLab · GitHub)
|
||||
|
||||
Deep dive on the **git-forge-hosted Markdown wikis** as one grouped memo: each is a
|
||||
**dedicated git repository of Markdown** exposed through a forge, attachable by **cloning
|
||||
the wiki repo directly** *or* (where offered) through the **forge's wiki API**. INTENT names
|
||||
**Gitea wikis** explicitly as a shard participant — this dive confirms the **git-native
|
||||
file-store** as a first-class, and the *simplest*, backend.
|
||||
|
||||
- `findings.md` — the three forges compared, the `.wiki.git` model, API matrix, capability
|
||||
profile, INTENT mapping, UC seeds (UC-76/77), architecture notes for SHARD-WP-0002, open
|
||||
questions (resolves the UC-68 source-of-truth question for this case), sources,
|
||||
traceability.
|
||||
|
||||
Catalog yield: UC-76 (attach a forge wiki by cloning its dedicated `.wiki.git` — git **is**
|
||||
the native store and the coordination journal), UC-77 (attach/write via the forge wiki API
|
||||
where git-clone is unavailable/undesired — capability varies by forge). Enriched
|
||||
UC-40/02/68/38. Feeds SHARD-WP-0002 T14 (attachment binding).
|
||||
173
research/260614-forge-wikis-deep-dive/findings.md
Normal file
@@ -0,0 +1,173 @@
|
||||
# git-forge wikis (Gitea · GitLab · GitHub) — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T5 · **Subject:** the Markdown wikis
|
||||
hosted by the three major git forges — **Gitea**, **GitLab**, **GitHub** — treated as one
|
||||
family because they share one architecture: *a wiki is a separate git repo of Markdown.*
|
||||
|
||||
## Why this dive
|
||||
|
||||
INTENT names **Gitea wikis** as a shard participant, and the whole project is "a **Git-based
|
||||
Markdown** wiki orchestrator." The forge wikis are therefore the **least exotic, highest-
|
||||
fit** backend in the entire study: the page store is *literally a git repository of Markdown
|
||||
files*. After fourteen dives into DBs, CRDTs, graphs and SaaS, this one confirms the
|
||||
**home case** — and sharpens it by contrasting *git-IS-the-store* (forge wikis) against
|
||||
*git-is-a-mirror* (Wiki.js, UC-68).
|
||||
|
||||
## 1. The shared architecture — a wiki is a `.wiki.git` repo
|
||||
|
||||
All three forges implement a project/repo wiki as a **second, dedicated git repository**
|
||||
alongside the code repo, addressable as `<repo>.wiki.git`:
|
||||
|
||||
- `git@host:owner/project.wiki.git` (GitLab), `…/owner/repo.wiki.git` (Gitea/GitHub).
|
||||
- **Pages are Markdown files** (`Home.md`, `Some-Page.md`), one file per page; the **page
title maps to the filename** (spaces ↔ hyphens by convention). Other markups are accepted
|
||||
(AsciiDoc, Textile, reStructuredText, Org), depending on the forge: GitHub via **Gollum** (the Ruby
git-backed wiki library), Gitea via a Gollum-style layout of its own, GitLab via its own renderer.
|
||||
- **History is git history** — every page edit (web or pushed) is a git commit with
|
||||
author/timestamp/message. *The wiki's revision history is a real git log.*
|
||||
- **Special pages** by convention: `_Sidebar`, `_Footer`, `_Header` (GitHub/Gitea),
|
||||
`_sidebar` (GitLab) — engine-rendered chrome stored as ordinary files.
|
||||
- **Subdirectories / nested pages**: GitLab and Gitea support directory structure; GitHub
|
||||
wikis are historically flat (Gollum supports paths but the GitHub UI is shallow).
|
||||
|
||||
The decisive property: **you can `git clone` the wiki repo, edit files, commit, and push**,
|
||||
and the forge UI reflects it — *and vice versa*. Git is **a** (often **the**) first-class
|
||||
write path. This is exactly shard-wiki's native medium with no impedance.
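
The naming conventions in this section can be sketched as three small helpers. They assume the `<repo>.wiki.git` suffix rule and the spaces-to-hyphens filename convention described above. The function names are hypothetical, and real forges may apply extra escaping that this sketch ignores.

```python
from pathlib import PurePosixPath


def wiki_remote(repo_remote):
    """Derive the wiki repository remote from a code repository remote."""
    remote = repo_remote.rstrip("/")
    if remote.endswith(".wiki.git"):
        return remote  # already a wiki remote
    if remote.endswith(".git"):
        remote = remote[: -len(".git")]
    return remote + ".wiki.git"


def title_to_filename(title, extension=".md"):
    """Map a page title to its file name (spaces become hyphens)."""
    return title.strip().replace(" ", "-") + extension


def filename_to_title(filename):
    """Map a file name back to a title (hyphens become spaces)."""
    return PurePosixPath(filename).stem.replace("-", " ")
```

The mapping is lossy for titles that already contain hyphens, so an adapter would likely treat the path, not the title, as the page's address.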
|
||||
|
||||
## 2. Where they differ — the API matrix
|
||||
|
||||
| | git clone/push of `.wiki.git` | wiki content **API** | nested dirs | markups |
|--|--|--|--|--|
| **Gitea** | ✅ yes | ✅ **REST wiki endpoints** (list/get/create/edit/delete pages) | ✅ | Markdown (+Gollum-style) |
| **GitLab** | ✅ yes | ✅ **REST Wikis API** (project & group wikis) | ✅ | Markdown/AsciiDoc/RDoc/Org |
| **GitHub** | ✅ yes | ❌ **no wiki REST API** — wiki is **git-only** (Gollum) | ⚠️ flat UI | Markdown + Gollum markups |
|
||||
|
||||
The key asymmetry: **GitHub exposes wiki content *only* through git** (the REST/GraphQL API
|
||||
covers issues/PRs/code but **not** wiki pages); **GitLab and Gitea offer both** a wiki API
|
||||
*and* git access. So the **git-clone path is the universal one** (works for all three); the
|
||||
API path is an *optional, capability-varying* alternative.
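
One way to model this asymmetry is a per-forge capability record with git as the default binding. The sketch below encodes the matrix above. The class, table, and function names are hypothetical, and `nested_dirs` reflects UI support as stated in the matrix rather than what the underlying git repo permits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForgeCapabilities:
    git_clone: bool    # clone/push of the .wiki.git repo
    content_api: bool  # REST endpoints for wiki pages
    nested_dirs: bool  # directory structure surfaced by the forge UI


# Values transcribed from the matrix in this section.
FORGES = {
    "gitea": ForgeCapabilities(git_clone=True, content_api=True, nested_dirs=True),
    "gitlab": ForgeCapabilities(git_clone=True, content_api=True, nested_dirs=True),
    "github": ForgeCapabilities(git_clone=True, content_api=False, nested_dirs=False),
}


def choose_binding(forge, prefer_api=False):
    """Pick 'git' or 'api' for a forge, defaulting to the universal git path."""
    caps = FORGES[forge]
    if prefer_api and caps.content_api:
        return "api"
    if caps.git_clone:
        return "git"
    if caps.content_api:
        return "api"
    raise ValueError(f"no usable binding for {forge}")
```

Asking for the API on GitHub falls back to git instead of failing, which mirrors the "optional capability, universal fallback" reading of the matrix.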
|
||||
|
||||
## 3. git-IS-the-store vs git-is-a-mirror (the UC-68 contrast)
|
||||
|
||||
Wiki.js (UC-68) keeps a **DB as canonical** and *maintains a git mirror* — so writing by
|
||||
commit risks **racing the engine's DB↔git sync** (catalog open-Q22). Forge wikis are the
|
||||
opposite: **the git repo IS the canonical store**; there is *no* separate DB of record for
|
||||
wiki content. Therefore:
|
||||
|
||||
- **The source-of-truth question (Q22) is resolved for this case:** the `.wiki.git` repo is
|
||||
authoritative. shard-wiki can **write by commit/push directly** with no engine to race —
|
||||
the forge merely *renders* what git holds.
|
||||
- The forge **API** (GitLab/Gitea), where present, is a *convenience over the same git
|
||||
repo*, not a competing store — so API-write and git-write converge on one history.
|
||||
|
||||
This makes forge wikis the **cleanest possible write-through file-store shard**: clone =
|
||||
projection/mirror, commit = overlay-applied/write, git log = the coordination journal *as
|
||||
is*.
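
To make "git log = the coordination journal" concrete, the sketch below parses tab-separated `git log` output into journal-like entries. It assumes the log was produced with `git log --pretty=format:%H%x09%an%x09%at%x09%s`. The entry field names and the conversion of timestamps to epoch milliseconds are illustrative choices, not a shard-wiki format.

```python
def parse_git_log(text):
    """Turn tab-separated `git log` output into oldest-first journal entries.

    Expected line layout: <hash> TAB <author name> TAB <unix seconds> TAB <subject>
    """
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        sha, author, seconds, subject = line.split("\t", 3)
        entries.append(
            {
                "id": sha,
                "author": author,
                "date": int(seconds) * 1000,  # epoch milliseconds
                "message": subject,
            }
        )
    entries.reverse()  # git log prints newest first; a journal reads oldest first
    return entries
```

Nothing is synthesized here: every field comes straight from the commit, so the journal is a reshaping of existing git history.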
|
||||
|
||||
## 4. Capability profile
|
||||
|
||||
| Dimension (synthesis spectrum) | Gitea / GitLab / GitHub wiki |
|--------------------------------|------------------------------|
| Attachment mode | **file-store (native: git clone)** + optional **external-API** (GitLab/Gitea wiki REST) |
| Addressing granularity | **page = file**; sub-page = path (GitLab/Gitea) |
| Content identity | path/filename within the wiki repo (title-derived) |
| Identity vs placement | placement-bound (path = identity), like a plain git repo |
| Structure | flat or directory tree of Markdown files; `_Sidebar`/`_Footer` chrome |
| History | **native git history** (real commits, authors, messages) |
| Merge model | **git** (3-way merge, branches) — though wiki repos are usually single-branch |
| Native query | none (it's files); forge full-text search over the wiki |
| Translation | **Markdown-native** (+ AsciiDoc/Org via renderer) — minimal/no translation needed |
| Attachment/write granularity | **file (page)** per commit |
| Operational envelope | ordinary git + forge; clone is cheap; API rate limits apply to API path |
| Access grant | **forge repo permissions** (delegated auth; per-repo/role ACL) |
| Content opacity | transparent Markdown in git |
| Provenance | git author/committer/timestamp per commit — native |
|
||||
|
||||
## 5. INTENT mapping
|
||||
|
||||
### Reinforcements (this is the home case)
|
||||
|
||||
- **Git-based Markdown orchestrator** (INTENT core): forge wikis *are* git repos of
|
||||
Markdown. The **wiki page model** (Markdown-first, path-addressed, git-versioned) maps 1:1
|
||||
— minimal adapter, maximal fit.
|
||||
- **Coordination journal = git** (INTENT): the wiki repo's **git log is already the
|
||||
coordination journal** — no synthesis needed; adopt it directly.
|
||||
- **Overlay before mutation**: overlays are **branches/commits** on the cloned wiki repo;
|
||||
applying = push, or an MR/PR where the forge supports one for wikis (GitLab does not, so
there the path is push-to-branch plus manual review).
|
||||
- **Graceful degradation**: even GitHub (no wiki API) is fully usable via git-clone — the
|
||||
*universal* path means a limited forge is still a first-class read/write shard.
|
||||
- **No silent remote mutation**: writes are explicit git pushes (or explicit API calls)
|
||||
under the user's forge credentials and repo permissions.
|
||||
|
||||
### Divergences (boundaries / notes — minor)
|
||||
|
||||
- **Capability varies by forge**: GitHub = git-only (no content API); GitLab/Gitea = git +
|
||||
API. The adapter must **model the API as an optional capability**, defaulting to the
|
||||
universal git path (T11/T14). Not a bug — exactly the capability-awareness INTENT mandates.
|
||||
- **Wiki repos rarely use branches/MRs for review**: forge wikis usually edit a single
|
||||
branch directly; the rich PR-review flow is on the *code* repo, not the wiki. So
|
||||
"overlay → review → merge" needs shard-wiki to provide the review layer, not the forge.
|
||||
- **Identity = path** (like any git repo) — cross-shard identity (T16) is layered above, as
|
||||
for plain git/`wiki/` subdir shards.
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **git-clone as the universal, canonical file-store attach** for forge wikis — Markdown +
|
||||
git history directly as page model + coordination journal (UC-76). The reference
|
||||
easy-case backend.
|
||||
2. **Forge wiki API as an optional capability** (GitLab/Gitea), with **git-only fallback**
|
||||
(GitHub) — capability-aware binding (UC-77).
|
||||
3. **git-IS-store ⇒ write-by-commit is safe** (no engine race) — record this as the
|
||||
resolution of the Wiki.js mirror dilemma (Q22) for forge wikis.
|
||||
|
||||
## 6. UC seeds
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-76 | Attach a **git-forge wiki** by **cloning its dedicated `.wiki.git`** — git is the native store; Markdown files = pages, git log = coordination journal; commit/push = write (no engine to race) | **new** |
| UC-77 | Attach/write a forge wiki via the **forge's wiki API** (GitLab/Gitea REST) where git-clone is unavailable or API-write is preferred; **git-only fallback** for GitHub — capability varies by forge | **new** |
| — | git-native file-store as the *canonical store* (not mirror) | enrich **UC-40** |
| — | dual-path attach (git clone vs forge API) | enrich **UC-02** |
| — | git-IS-store vs engine-maintained mirror (resolves Q22) | enrich **UC-68** |
| — | forge as an API host for the wiki resource | enrich **UC-38** |
|
||||
|
||||
## 7. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T14 (adapter binding / attach path):** forge wikis are the canonical **file-store
|
||||
attach** — bind to the `.wiki.git` clone as the universal path; model the **wiki API as an
|
||||
optional, forge-specific capability** (present: GitLab, Gitea; absent: GitHub). One shard,
|
||||
two possible bindings converging on the same git history.
|
||||
- **T11 (capability model):** "has-content-API" is a **per-forge capability flag**; git
|
||||
clone/push is the baseline every forge satisfies. Minimal adapter profile — near the
|
||||
Oddmuse-simple end but Markdown-native and git-versioned.
|
||||
- **Coordination journal:** adopt the wiki repo's **git log directly** — the one backend
|
||||
where INTENT's git-backed journal needs *zero* synthesis.
|
||||
- **Resolves Q22 (UC-68):** because git **is** the store (not a mirror), **write-by-commit
|
||||
is safe** — no engine DB↔git sync to race. Record the distinction *engine-mirror*
|
||||
(Wiki.js: DB canonical, careful) vs *git-canonical* (forge wikis: commit freely).
|
||||
|
||||
## 8. Open questions
|
||||
|
||||
1. For overlay → **review** → apply, does shard-wiki supply the review layer over a forge
|
||||
wiki (which lacks wiki-MRs), e.g. via a branch + its own diff/approve, or push directly?
|
||||
2. When a forge offers **both** git and a wiki API (GitLab/Gitea), which does the adapter
|
||||
prefer by default — git (universal, full history) with API as a fallback for hosts where
|
||||
clone is disabled? (cf. UC-43 backend-swap under stable binding.)
|
||||
3. Should the **code-repo `wiki/` subdir** shard and the **forge wiki repo** shard share one
|
||||
adapter (both git+Markdown) with a "which repo / which path" parameter, or stay distinct?
|
||||
|
||||
## 9. Sources
|
||||
|
||||
- GitLab Docs — *Wiki* (separate git repo; web/git/API; `.wiki.git`) — docs.gitlab.com
|
||||
- Gitea — wiki via git clone + repository **wiki API**; forum/issue threads on
|
||||
`.wiki.git` clone (go-gitea/gitea #1426, #15420) — gitea.com / github.com/go-gitea
|
||||
- GitHub — wiki = Gollum git repo (`<repo>.wiki.git`), no wiki REST API — docs.github.com
|
||||
- Gollum (git-based wiki library) — github.com/gollum/gollum
|
||||
- prior: `research/260614-wikijs-deep-dive/` (engine-maintained mirror contrast, UC-68)
|
||||
|
||||
## 10. Traceability
|
||||
|
||||
New UCs **UC-76–UC-77** carry the marker **⎇** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md`. Enriched: UC-40, UC-02, UC-68, UC-38. Architecture cross-refs:
|
||||
SHARD-WP-0002 T14, T11; coordination-journal-from-git; resolves catalog open-Q22.
|
||||
33
research/260614-glamorous-toolkit-deep-dive/README.md
Normal file
@@ -0,0 +1,33 @@
|
||||
# 260614 — Glamorous Toolkit (moldable development) deep dive
|
||||
|
||||
Date: 2026-06-14 · Source: **SHARD-WP-0004 T7**
|
||||
|
||||
## What this is
|
||||
|
||||
A deep dive into **Glamorous Toolkit** (GT, on Pharo): **moldable development** — cheap,
|
||||
custom, **domain-specific views** (`gtView` methods) so any object explains itself through an
|
||||
**open set of co-equal projections, none canonical** — plus **Lepiter**, GT's live notebook/
|
||||
knowledge base (git-versionable JSON page files with live, inspectable code results).
|
||||
|
||||
## Why it matters
|
||||
|
||||
- Strongest prior art for **moldable, multi-view projection**: projection is not *a* view
|
||||
but an **open, type-keyed set of co-equal, possibly-computed views, none privileged** —
|
||||
refines SHARD-WP-0002 **T16** and unifies replication-/derivation-/dimensional-/query-
|
||||
projection under "many co-equal views."
|
||||
- Generalizes ZigZag dimensional views (UC-47/48) and query/computed views (UC-54) into a
|
||||
**pluggable view registry** keyed by content type (answers UC-55's open question on a
|
||||
content-type registry).
|
||||
- Reinforces **files-canonical, liveness-above, degrade-to-snapshot** (Lepiter files vs the
|
||||
Pharo image; same boundary as Jupyter UC-84 / Squeak T6).
|
||||
|
||||
## Yield
|
||||
|
||||
- **No new UC** (design prior art, not a candidate shard — like the UseModWiki lineage dive).
|
||||
- Enrich **UC-47, UC-48, UC-54**; links **UC-55, UC-83, UC-84, UC-79**.
|
||||
|
||||
## Contents
|
||||
|
||||
| Path | Role |
|------|------|
| `findings.md` | Moldable development & `gtView`, the Moldable Inspector, Lepiter, relation to ZigZag/query/derivation projection, INTENT mapping, UC disposition (enrichment-only), architecture notes, open questions |
|
||||
161
research/260614-glamorous-toolkit-deep-dive/findings.md
Normal file
@@ -0,0 +1,161 @@
|
||||
# Glamorous Toolkit — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0004 T7 · **Subject:** Glamorous Toolkit (GT) on
|
||||
Pharo — moldable development, custom views/inspectors, the live notebook (Lepiter).
|
||||
|
||||
## Why this dive
|
||||
|
||||
T1/T3 gave us **derivation-projection** (one source → rendered/computed forms). GT comes at
|
||||
projection from the *other* side: **many co-equal, domain-specific views over the same live
|
||||
content**, where the *environment molds itself to the knowledge* rather than forcing the
|
||||
knowledge into one fixed rendering. This is the strongest prior art for **moldable,
|
||||
multi-view projection** and a striking parallel to ZigZag's dimensional model (UC-47/48) and
|
||||
to query/computed views (UC-54). It is *design* prior art — GT is not a candidate shard —
|
||||
so the yield is **enrichment + projection design notes**, not a new shard UC.
|
||||
|
||||
## 1. Moldable development
|
||||
|
||||
GT's thesis: **systems should be explainable through custom tools that are cheap to build.**
|
||||
Instead of a single generic object inspector/renderer, a developer adds **small,
|
||||
domain-specific views** to a class so any instance explains itself in the terms that matter:
|
||||
|
||||
- **`gtView`-annotated methods** — a class declares extra inspector views by writing methods
|
||||
tagged `<gtView>` that return a view (tree, list, table, chart, source, raw, a custom
|
||||
diagram…). Each is a **co-equal projection of the same object**; none is privileged.
|
||||
- The **Moldable Inspector** shows these views as **switchable tabs** over one object, and
|
||||
lets you **dive** into sub-objects (each with its own custom views) — navigation *is*
|
||||
moving across projections.
|
||||
- Views are **cheap and local**: a view is just a method, versioned with the code, added
|
||||
incrementally as understanding grows. The environment **adapts to the domain**.
|
||||
|
||||
The key abstraction for us: **the same underlying content carries an open, extensible set of
|
||||
views, selected at inspection time, none canonical.**
|
||||
|
||||
## 2. Lepiter — the live notebook / knowledge base
|
||||
|
||||
GT ships **Lepiter**, a notebook/knowledge base where pages mix prose, **live code
|
||||
snippets** (evaluated in-image, results inspectable with the moldable views above), and
|
||||
links. Notebook pages are stored as **JSON "database" files** on disk (a Lepiter DB =
|
||||
directory of page files), so the knowledge base is **file-backed and git-versionable** while
|
||||
remaining live in the image.
|
||||
|
||||
This is the literate/notebook pattern (T1/T3) fused with moldable views: a snippet's result
|
||||
is not a static captured output but a **live object you can open into any of its views** —
|
||||
the *anti-snapshot*. (Boundary: that liveness is exactly what shard-wiki must degrade to a
|
||||
snapshot when the image isn't present — see §4.)
|
||||
|
||||
## 3. Relationship to ZigZag, query-views, and the projection model
|
||||
|
||||
- **vs ZigZag (UC-47/48):** ZigZag gives **dimensional** views — the *same cells* seen along
|
||||
different orthogonal axes. GT gives **moldable** views — the *same object* seen through
|
||||
different *purpose-built lenses*. Both reject a single privileged rendering; both make
|
||||
**multi-view, none-canonical** the norm. GT generalizes the idea from fixed dimensions to
|
||||
an **open, code-defined view set**.
|
||||
- **vs query/computed views (UC-54):** a `gtView` is a **computed projection** (it runs code
|
||||
to build the view) — like a query-defined page, but keyed to a **content type / domain**
|
||||
rather than a stored query. Strengthens "a view can be computed, not stored."
|
||||
- **vs derivation-projection (T1/UC-83):** GT confirms projections are **plural and
|
||||
co-equal**; UC-83/UC-84 had *few* well-known derivations (docs/code/outputs), GT has an
|
||||
**open registry** of them keyed by type.
|
||||
|
||||
So GT's contribution to the contract is: **projection is not one view but an open set of
|
||||
co-equal, type-keyed, possibly-computed views, none canonical** — a *moldable projection
|
||||
registry*. That is a refinement of T16's projection model, not a new shard.
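
A moldable projection registry could look roughly like the sketch below: views are plain functions registered against a content type, listed by name, and none is marked as default. This is an illustration of the idea in Python, not GT's Pharo API. `ViewRegistry` and the two sample views are hypothetical.

```python
class ViewRegistry:
    """Open set of named, computed views keyed by content type."""

    def __init__(self):
        self._views = {}

    def register(self, content_type, name):
        """Decorator that adds a view function for a content type."""
        def decorator(view):
            self._views.setdefault(content_type, {})[name] = view
            return view
        return decorator

    def names(self, content_type):
        """All view names for a content type; no view is privileged."""
        return sorted(self._views.get(content_type, {}))

    def render(self, content_type, name, content):
        """Compute one view of the content on demand."""
        return self._views[content_type][name](content)


registry = ViewRegistry()


@registry.register("markdown", "raw")
def raw_view(text):
    return text


@registry.register("markdown", "outline")
def outline_view(text):
    # Headings only: a computed projection of the same content.
    return [line.lstrip("#").strip() for line in text.splitlines() if line.startswith("#")]
```

Adding a view is one decorated function, which is the "cheap and local" property carried over; callers choose among `registry.names(...)` at inspection time.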
|
||||
|
||||
## 4. INTENT mapping
|
||||
|
||||
### Reinforcements
|
||||
|
||||
- **Union without erasure / no privileged rendering.** GT's "many co-equal views, none
|
||||
canonical" is the same ethos as showing provenance and freshness without hiding either: a page's
|
||||
presentation is **plural**, and shard-wiki should be able to offer **multiple co-equal
|
||||
projections of the same content** (raw, rendered, structured, domain-specific) rather than
|
||||
one flattened view.
|
||||
- **Computed views (UC-54) keyed by content type.** A moldable view = a **computed
|
||||
projection registered against a content type** — directly supports a *pluggable view/
|
||||
projection registry* in the contract (the answer shape for UC-55's "pluggable content-type
|
||||
registry" open question).
|
||||
- **Files-canonical, live-on-top.** Lepiter stores pages as **git-versionable JSON files**
|
||||
while being live in the image — reinforcing "the durable artifact is files; liveness is a
|
||||
layer above," consistent with shard-wiki's git-canonical stance.
|
||||
- **Mechanism over policy.** Which views to show, and whether to compute or snapshot them,
|
||||
stay configurable; GT provides the *mechanism* (open view set), not a fixed presentation.
|
||||
|
||||
### Divergences / boundaries
|
||||
|
||||
- **The image is not a store (shared boundary with T6 Squeak).** GT's *liveness* lives in a
|
||||
Pharo image; shard-wiki must not treat the image as a shard. Attach the **Lepiter DB files**
|
||||
(git-versionable) as the durable content; treat live/computed views as **derivation-
|
||||
projections that degrade to captured snapshots** when no image/kernel is present (same rule
|
||||
as Jupyter UC-84).
|
||||
- **Not a view engine.** shard-wiki *models* "this content type has these co-equal views and
|
||||
one may be canonical-for-display"; it does not implement GT's rendering. Domain-specific
|
||||
view code stays with the source/adapter, surfaced as a capability.
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **Moldable projection = an open, type-keyed set of co-equal, possibly-computed views,
|
||||
none canonical** — refine T16's projection model and the UC-47/48/54 cluster toward a
|
||||
**pluggable view/projection registry**.
|
||||
2. **Lepiter** = literate/notebook (T1/T3) with **live results** stored as git-versionable
|
||||
files — another "files-canonical, liveness-above, degrade-to-snapshot" instance.
|
||||
3. **Navigation = moving across projections** (dive into sub-objects) — a navigation idea
|
||||
beside dimensional movement (UC-47/48) for the derived-views thread.
|
||||
|
||||
## 5. UC disposition (enrichment-only — no new shard UC)
|
||||
|
||||
| Mechanism (findings §) | Catalog UC |
|------------------------|------------|
| Many co-equal, domain-specific views over the same content; none canonical (§1, §3) | UC-47 / UC-48 (enriched) |
| `gtView` = a **computed projection registered against a content type** (§1, §3) | UC-54 (enriched) |
| Open, pluggable view set keyed by type = the shape of a content-type/view registry (§3) | links UC-55 (open-Q #10) |
| Lepiter live-snippet results = live objects → degrade to snapshot when no image (§2, §4) | links UC-84, UC-83 |
| Lepiter pages = git-versionable JSON files; image is not the store (§2, §4) | links UC-79 (files-canonical) |
| Dive-into-sub-object navigation = moving across projections (§1) | links UC-17–UC-20 (derived views) |
|
||||
|
||||
GT is **design prior art, not a candidate shard**, so it yields **no new UC** (like the
|
||||
UseModWiki lineage dive). Its value is sharpening the **projection model** for SHARD-WP-0002
|
||||
T16: projection is an **open set of co-equal, type-keyed, possibly-computed views, none
|
||||
privileged** — a moldable view registry.
|
||||
|
||||
## 6. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T16 (projection):** generalize projection from "a view" to a **moldable view registry** —
|
||||
an open, extensible set of **co-equal, type-keyed, possibly-computed** projections of the
|
||||
same content, none canonical (display-canonical is a policy choice, not a fact). This
|
||||
unifies replication-projection (UC-53/57), derivation-projection (UC-83/84), dimensional
|
||||
views (UC-47/48), and query/computed views (UC-54) under one "many co-equal views" model.
|
||||
- **T12 (page model):** a content type may **carry/declare its own views** (a capability);
|
||||
the page model should allow attaching view definitions to a type (the registry's entries),
|
||||
answering UC-55's "pluggable content-type registry" question.
|
||||
- **Boundary (T14):** attach the **Lepiter DB files** (durable, git-versionable), never the
|
||||
Pharo image; live/computed views are derivation-projections degrading to snapshots — same
|
||||
rule as Jupyter (no kernel/image host).
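
The degrade-to-snapshot rule in the T14 boundary can be sketched as follows. `Projection`, `resolve_view`, and the `live_host` callable are hypothetical names used only for illustration; the point is that the result always states whether it is live or a captured snapshot.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Projection:
    body: str
    live: bool          # True if computed now, False if a stored snapshot
    captured_at: str    # provenance marker for snapshots ("" when live)


def resolve_view(snapshot: Projection,
                 live_host: Optional[Callable[[], str]] = None) -> Projection:
    """Return a live projection if a host (image/kernel) is available,
    otherwise fall back to the captured snapshot, unchanged and still labeled."""
    if live_host is None:
        return snapshot
    try:
        return Projection(body=live_host(), live=True, captured_at="")
    except Exception:
        # Host present but failed: degrade rather than hide the content.
        return snapshot


stored = Projection(body="<rendered view>", live=False, captured_at="2026-06-14")
print(resolve_view(stored))
print(resolve_view(stored, live_host=lambda: "<fresh view>"))
```

Because the snapshot keeps its `live=False` flag and capture date, a union view can show it without presenting it as current.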
|
||||
|
||||
## 7. Open questions
|
||||
|
||||
1. Does shard-wiki expose a **view/projection registry** as a first-class public concept
|
||||
(content type → its co-equal views), or keep "moldable" as an adapter-internal idea?
|
||||
2. When multiple co-equal views exist, is **"canonical-for-display"** a per-shard policy, a
|
||||
user preference, or unset (always show the chooser)? (Mechanism-over-policy says
|
||||
configurable.)
|
||||
3. How does a **computed view** (UC-54/`gtView`) declare its **freshness/provenance** so the
|
||||
union doesn't present a stale computed projection as current? (Ties UC-84's snapshot
|
||||
marking.)
|
||||
|
||||
## 8. Sources
|
||||
|
||||
- Glamorous Toolkit docs (`gtoolkit.com`): moldable development, the Moldable Inspector,
|
||||
`gtView` methods, Lepiter knowledge base; Feenk essays on moldable development.
|
||||
- Pharo (substrate — see T8 dive): live image, reflective environment.
|
||||
- prior: `research/260614-zigzag-deep-dive/` (dimensional multi-view UC-47/48);
|
||||
`research/260614-jupyter-deep-dive/` (live vs snapshot, UC-84);
|
||||
`research/260614-literate-programming-deep-dive/` (derivation-projection, UC-83).
|
||||
|
||||
## 9. Traceability
|
||||
|
||||
**No new UC** (GT is design prior art, not a candidate shard). Enriched: UC-47, UC-48,
|
||||
UC-54; links UC-55 (content-type/view registry), UC-83/UC-84 (live→snapshot), UC-79
|
||||
(files-canonical), UC-17–UC-20 (derived-view navigation). Architecture cross-refs:
|
||||
SHARD-WP-0002 T16 (moldable view registry: open set of co-equal type-keyed computed views),
|
||||
T12 (content type declares its views), T14 (attach Lepiter files, not the image).
|
||||
15
research/260614-ikiwiki-deep-dive/README.md
Normal file
@@ -0,0 +1,15 @@
|
||||
# 260614 — ikiwiki deep dive
|
||||
|
||||
Deep dive on **ikiwiki** (Joey Hess): a **wiki compiler** — it compiles a **VCS-backed**
|
||||
(usually **git**) tree of Markdown source into **static HTML**, supports **distributed**
|
||||
operation (clone/edit/push between wiki instances, change **pings**), and treats web edits as
|
||||
commits to the same repo.
|
||||
|
||||
- `findings.md` — the compiler model, git-distributed federation + pinger, static output,
|
||||
capability profile, INTENT mapping, UC seed (UC-79), architecture notes for SHARD-WP-0002,
|
||||
open questions, sources, traceability.
|
||||
|
||||
Catalog yield: UC-79 (attach a git-backed **compile-to-static** wiki — git Markdown source
|
||||
is the shard, compiled static HTML is a derived publish/projection; participate in
|
||||
git-distributed clone federation with change-pings). Enriched UC-31/56/37/33. Feeds
|
||||
SHARD-WP-0002 T4 (federation), T6 (publish/projection).
|
||||
153
research/260614-ikiwiki-deep-dive/findings.md
Normal file
@@ -0,0 +1,153 @@
|
||||
# ikiwiki — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T4 · **Subject:** ikiwiki, Joey Hess's
|
||||
VCS-backed wiki compiler.
|
||||
|
||||
## Why this dive
|
||||
|
||||
The forge-wiki dive (T5) established *git-IS-the-store* for hosted Markdown wikis. ikiwiki
|
||||
takes the same git-canonical source but adds two ideas shard-wiki cares about directly:
|
||||
**compile-to-static** (the wiki is *built*, not served from a DB) and **git-distributed
|
||||
federation** (wiki instances clone, pull, push, and **ping** each other). It is referenced
|
||||
in `research/260608-federation-concepts/`; here we go into the model.
|
||||
|
||||
## 1. The wiki-compiler model
|
||||
|
||||
ikiwiki is fundamentally a **compiler**: input is a directory of **source pages** (Markdown
|
||||
by default; also other formats) held in a **version-control repo**; output is a tree of
|
||||
**static HTML**. A rebuild is triggered by a **VCS post-commit/post-update hook**, so:
|
||||
|
||||
- **The VCS repo is canonical**; the HTML is *derived build output* (regenerable, disposable).
|
||||
- **Web edits are commits.** The CGI edit interface writes the change *into the repo* (a
|
||||
commit) and triggers a rebuild — so browser edits and `git push` edits converge on one
|
||||
history. (Same convergence as forge wikis, but here the canonical store is *your* git repo,
|
||||
not a forge's.)
|
||||
- **VCS-agnostic**: git is usual, but svn/bzr/mercurial/darcs are supported via a VCS
|
||||
plugin layer — an early "pluggable backend behind a stable interface" (adapter-contract
|
||||
echo).
|
||||
- **Plugins** (Perl) provide directives, feeds, auth (`openid`), and the federation hooks
|
||||
below.
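
To make the canonical-source / derived-output split concrete, here is a toy compiler in the same spirit. It is not ikiwiki's implementation (ikiwiki is written in Perl and renders real Markdown); `compile_wiki` is a hypothetical helper that wraps each source file's escaped text in minimal HTML, so the output can be regenerated from the source tree at any time.

```python
import html
import tempfile
from pathlib import Path
from typing import List


def compile_wiki(src: Path, out: Path) -> List[str]:
    """Rebuild every *.md page under src into a static *.html file under out.

    Placeholder rendering only: the text is escaped and wrapped in <pre>,
    standing in for a real Markdown renderer.
    """
    built = []
    for page in sorted(src.rglob("*.md")):
        rel = page.relative_to(src).with_suffix(".html")
        target = out / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        body = html.escape(page.read_text(encoding="utf-8"))
        target.write_text(f"<html><body><pre>{body}</pre></body></html>",
                          encoding="utf-8")
        built.append(rel.as_posix())
    return built


with tempfile.TemporaryDirectory() as tmp:
    src, out = Path(tmp) / "src", Path(tmp) / "html"
    (src / "notes").mkdir(parents=True)
    (src / "index.md").write_text("# Home", encoding="utf-8")
    (src / "notes" / "a.md").write_text("a < b", encoding="utf-8")
    built_pages = compile_wiki(src, out)
    rendered_a = (out / "notes" / "a.html").read_text(encoding="utf-8")
print(built_pages)
```

In ikiwiki the equivalent rebuild runs from a VCS hook; deleting the output directory loses nothing, because the source repo can regenerate it.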
|
||||
|
||||
## 2. Git-distributed federation + the pinger
|
||||
|
||||
Because the source is an ordinary VCS repo, ikiwiki instances federate the way *git* does:
|
||||
|
||||
- **Clone-and-diverge**: you can `git clone` a wiki, edit offline, and `push`/`pull` between
|
||||
instances — **multiple wiki clones that reconcile via git merge**. A wiki is a branch-space
|
||||
(UC-33).
|
||||
- **`pinger` / `pingee` plugins**: an instance can send an **XML-RPC ping** to another
|
||||
ikiwiki when it changes, prompting the other to **pull and rebuild** — a lightweight
|
||||
**subscribe/notify** primitive over the git-distributed mesh (UC-31).
|
||||
- **`aggregate` plugin**: pulls external **RSS/Atom feeds** into the wiki as pages — an
|
||||
inbound projection of remote content.
|
||||
|
||||
So ikiwiki is *federation by git plus a ping* — distinct from fedwiki's fork/journal
|
||||
(UC-72) and from Wikibase's query-time `SERVICE` (UC-74): a **third federation flavor**,
|
||||
*VCS-replication federation with change notification*.
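
The "replicate, then notify" shape can be modeled in a few lines. This is an in-memory simulation with hypothetical names (`Peer`, `commit`, `on_ping`); it makes no real git or network calls, and it reduces merge to appending commits the receiver has not yet seen.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Commit = Tuple[str, str, str]  # (commit id, page path, new content)


@dataclass
class Peer:
    name: str
    log: List[Commit] = field(default_factory=list)      # stands in for git history
    site: Dict[str, str] = field(default_factory=dict)   # derived "static" output
    subscribers: List["Peer"] = field(default_factory=list)

    def commit(self, commit_id: str, path: str, content: str) -> None:
        self.log.append((commit_id, path, content))
        self.rebuild()
        for peer in self.subscribers:
            peer.on_ping(self)   # notify only; the receiver does the pulling

    def on_ping(self, source: "Peer") -> None:
        seen = {cid for cid, _, _ in self.log}
        # "Pull": take commits we lack, in the source's order.
        self.log.extend(c for c in source.log if c[0] not in seen)
        self.rebuild()

    def rebuild(self) -> None:
        # Derived output is recomputed from history, never edited directly.
        # upper() is a placeholder for a real render step.
        self.site = {}
        for _, path, content in self.log:
            self.site[path] = content.upper()


origin, mirror = Peer("origin"), Peer("mirror")
origin.subscribers.append(mirror)
origin.commit("c1", "index", "hello")
origin.commit("c2", "about", "ikiwiki")
print(mirror.site)
```

The ping carries no content; it only tells the subscriber that pulling is worthwhile, which is why peer selection and conflict policy can stay outside the mechanism.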
|
||||
|
||||
## 3. Static output as a publish/projection target
|
||||
|
||||
The compiled static HTML is a **read-only, regenerable projection** of the source:
|
||||
|
||||
- It is a natural **outbound publish target** (UC-56): render the union (or a shard) to a
|
||||
static site for hosting/backup, no server needed.
|
||||
- It is also the **read-only backup** end (UC-37): a static snapshot that survives the engine.
|
||||
|
||||
The key shard-wiki framing: **source (git Markdown) is the attachable shard; static HTML is a
|
||||
derived projection** — never confuse the build output for the canonical content.
|
||||
|
||||
## 4. Capability profile
|
||||
|
||||
| Dimension (synthesis spectrum) | ikiwiki |
|--------------------------------|---------|
| Attachment mode | **file-store** (VCS-backed git Markdown source) |
| Addressing granularity | page = source file; path = identity |
| Content identity | path/filename (placement-bound) |
| Structure | directory tree of Markdown source + directives |
| History | **native VCS (git) history** |
| Merge model | **git** (clone/pull/push/merge across instances) |
| Native query | none; directives + plugins compute derived pages at build |
| Translation | Markdown source → static HTML (build-time render) |
| Write granularity | **file (page)** per commit |
| Operational envelope | a compiler + VCS hook; static hosting for output |
| Access grant | VCS/file perms; `openid` for web edits |
| Content opacity | transparent Markdown |
| Provenance | git author/timestamp; aggregated feeds carry source |
| Federation | **git replication + XML-RPC ping**; RSS aggregation |
|
||||
|
||||
## 5. INTENT mapping
|
||||
|
||||
### Reinforcements
|
||||
|
||||
- **Git-canonical Markdown source** = the home case (shared with forge wikis UC-76/40):
|
||||
attach the source repo, adopt its git log as the journal, write by commit.
|
||||
- **Coordination layer is git** (INTENT): ikiwiki's whole federation *is* git replication +
|
||||
a ping — the most literal realization of "Git-addressable coordination layer."
|
||||
- **Projection vs canonical**: compile-to-static cleanly separates **canonical source** from
|
||||
**derived output** — exactly shard-wiki's projection principle (static HTML = a lazy/cache
|
||||
projection that is regenerable, never the source of truth).
|
||||
- **Graceful degradation / publish**: static output is the trivial read-only backup and
|
||||
outbound publish target (UC-37/UC-56).
|
||||
- **Subscribe/notify mechanism, not policy**: the pinger is a *mechanism* (notify a peer to
|
||||
pull); *which* peers, when, and conflict policy stay configurable.
|
||||
|
||||
### Divergences (boundaries / notes)
|
||||
|
||||
- ikiwiki is mostly a **reinforcement** of git-canonical-Markdown (UC-76) — its *new*
|
||||
contributions are (a) **compile-to-static** as a distinct projection/publish direction and
|
||||
(b) the **git-distributed-clone + ping** federation flavor. The static output is *not* a
|
||||
shard to attach (it's derived); attach the **source repo**.
|
||||
- VCS-agnostic backend layer is an interesting adapter-contract echo, but for shard-wiki the
|
||||
git case dominates.
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **Source-repo-as-shard, static-output-as-derived-projection** — never attach the build
|
||||
output as canonical (UC-79; relates UC-37/UC-56).
|
||||
2. **Git-replication + change-ping** as a named federation flavor beside fork/journal
|
||||
(fedwiki) and query-SERVICE (Wikibase) — a **subscribe/notify over a git mesh** (UC-31/UC-33).
|
||||
3. **Inbound feed aggregation** (RSS/Atom → pages) as an inbound projection pattern.
|
||||
|
||||
## 6. UC seed
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-79 | Attach a git-backed **compile-to-static** wiki — the **git Markdown source is the shard**, compiled **static HTML is a derived publish/projection**; participate in **git-distributed clone federation** with change-**pings** | **new** |
| — | XML-RPC pinger = subscribe/notify over a git mesh | enrich **UC-31** |
| — | render union/shard to a static site (publish) | enrich **UC-56** |
| — | static HTML as read-only regenerable backup | enrich **UC-37** |
| — | wiki as a git branch-space; clones reconcile via merge | enrich **UC-33** |
|
||||
|
||||
## 7. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T4 (federation):** record **git-replication + change-ping** as a federation flavor — peers
|
||||
hold git clones, reconcile via merge, and notify with a ping to pull/rebuild. Distinct from
|
||||
fedwiki fork/journal and Wikibase `SERVICE`; complements the "Git-addressable coordination
|
||||
layer" mandate most literally.
|
||||
- **T6 (publish/projection):** **compile-to-static** is the archetypal *outbound* projection:
|
||||
source (git) is canonical, static HTML is a regenerable derived view (publish/backup
|
||||
target). Reinforces projection-vs-canonical separation.
|
||||
- **Attach binding (T14):** attach the **source repo** (git Markdown), not the build output;
|
||||
shares the forge-wiki / `wiki/`-subdir git+Markdown adapter (parameterized by repo/path).
|
||||
|
||||
## 8. Open questions
|
||||
|
||||
1. Is the **pinger** (notify-peer-to-pull) modeled as shard-wiki's own subscribe/notify
|
||||
primitive, or only recognized when bridging two ikiwiki instances?
|
||||
2. Does shard-wiki ever *drive* a compile-to-static **publish** of the union (act as the
|
||||
ikiwiki-like compiler), or only attach existing ikiwiki source repos? (UC-56 scope.)
|
||||
3. Is **feed aggregation** (RSS→pages) an inbound projection mode shard-wiki offers generally
|
||||
(a feed-shard), or an ikiwiki-internal feature?
|
||||
|
||||
## 9. Sources
|
||||
|
||||
- ikiwiki.info — *ikiwiki* overview, *rcs* (VCS backends), *plugins* (`pinger`/`pingee`,
|
||||
`aggregate`, `openid`), *setup* / post-commit hooks
|
||||
- prior: `research/260608-federation-concepts/` (ikiwiki reference);
|
||||
`research/260614-forge-wikis-deep-dive/` (git-canonical Markdown contrast)
|
||||
|
||||
## 10. Traceability
|
||||
|
||||
New UC **UC-79** carries the marker **⊟** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md`. Enriched: UC-31, UC-56, UC-37, UC-33. Architecture cross-refs:
|
||||
SHARD-WP-0002 T4 (git-replication+ping federation), T6 (compile-to-static publish), T14.
|
||||
35
research/260614-jupyter-deep-dive/README.md
Normal file
@@ -0,0 +1,35 @@
|
||||
# 260614 — Jupyter Notebooks deep dive
|
||||
|
||||
Date: 2026-06-14 · Source: **SHARD-WP-0004 T3**
|
||||
|
||||
## What this is
|
||||
|
||||
A deep dive into **Jupyter Notebooks**: the **`.ipynb` JSON** document (ordered cells:
|
||||
markdown / code+outputs / raw), **kernels**, **embedded computed outputs** (MIME bundles),
|
||||
and **execution-count provenance**. The dominant modern computational document and the
|
||||
concrete case where the **derived output is captured and stored *inside* the source** with
|
||||
real-but-fragile provenance.
|
||||
|
||||
## Why it matters
|
||||
|
||||
- Tests the T1 **replication- vs derivation-projection** split on the dominant real artifact
|
||||
and adds the wrinkle that **outputs are stored back inside the source** — the source/
|
||||
projection line runs *through* the document.
|
||||
- The page model (T12) must carry a **notebook shape**: ordered cells with code cells owning
|
||||
embedded computed outputs that have **weak execution provenance** (run order, environment
|
||||
not captured).
|
||||
- Non-Markdown + lossy translation (T15): JSON+MIME bundles; nbconvert→Markdown is lossy and
|
||||
directional. JSON diffs are noisy → Jupytext text-pairing / nbdime (T13).
|
||||
|
||||
## Yield
|
||||
|
||||
- **UC-84** (new): attach/project a computational notebook preserving cell structure +
|
||||
embedded outputs, surfacing outputs as **snapshots with weak execution provenance**;
|
||||
re-execution **capability-gated**, default = present snapshot + static render.
|
||||
- Enrich **UC-54, UC-55, UC-59, UC-35**; links **UC-32, UC-83, UC-79**.
|
||||
|
||||
## Contents
|
||||
|
||||
| Path | Role |
|------|------|
| `findings.md` | `.ipynb` model, kernels/execution-count fragility, ecosystem (nbconvert/Jupytext/papermill/nbdime/nbstripout), capability profile, INTENT mapping, UC seed, architecture notes, open questions |
|
||||
185
research/260614-jupyter-deep-dive/findings.md
Normal file
@@ -0,0 +1,185 @@
|
||||
# Jupyter Notebooks — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0004 T3 · **Subject:** Jupyter Notebooks — the
|
||||
`.ipynb` JSON document, kernels, embedded computed outputs, execution provenance.
|
||||
|
||||
## Why this dive
|
||||
|
||||
T1 (literate programming) established **one source → derived projections** and split
|
||||
**replication-projection** from **derivation-projection**. Jupyter is the *dominant modern
|
||||
computational document* and the concrete case where the **derived output is captured and
|
||||
stored inside the source** — a non-Markdown, partially-executable content type whose
|
||||
provenance is real but **fragile**. It is the most plausible concrete "computational shard"
|
||||
content type, so it tests the page model (T12), lossy translation (T15), and the
|
||||
output-provenance question head-on.
|
||||
|
||||
## 1. The `.ipynb` document model
|
||||
|
||||
A notebook is a single **JSON document** (`nbformat`), not Markdown:
|
||||
|
||||
- **`cells[]`** — an ordered list. Each cell has a `cell_type`:
|
||||
- `markdown` — prose (Markdown + LaTeX), the human-readable part.
|
||||
- `code` — source text (`source`), plus an **`execution_count`** and an **`outputs[]`**
|
||||
array captured from the last run.
|
||||
- `raw` — passthrough.
|
||||
- **`outputs[]`** (per code cell) carry results inline: `stream` (stdout/stderr),
|
||||
`execute_result` / `display_data` (a **MIME bundle** — `text/plain`, `text/html`,
|
||||
`image/png` base64, `application/json`, vendor MIME types), and `error` (traceback).
|
||||
- **`metadata`** at notebook and cell level (`kernelspec`, `language_info`, tags like
|
||||
`hide-input`, `scrolled`, slideshow roles).
|
||||
|
||||
So an `.ipynb` is **source + last-run computed outputs + environment metadata, fused in one
|
||||
JSON file**. The Markdown cells are an *island* inside a JSON envelope — relevant to how
|
||||
shard-wiki extracts/round-trips content.
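
The structure above is plain JSON, so the standard library is enough to inspect it. The sketch below builds a tiny notebook inline (the cell contents are made up) and separates the Markdown "island" from code sources and captured outputs; `split_notebook` is a hypothetical helper, not part of `nbformat`.

```python
import json

# A minimal, hand-written notebook in nbformat 4 shape (contents are invented).
NOTEBOOK_JSON = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {"cell_type": "markdown", "id": "intro", "metadata": {},
         "source": ["# Title\n", "Some prose."]},
        {"cell_type": "code", "id": "calc", "metadata": {},
         "execution_count": 1, "source": "1 + 1",
         "outputs": [{"output_type": "execute_result", "execution_count": 1,
                      "metadata": {}, "data": {"text/plain": "2"}}]},
    ],
})


def _text(source):
    # `source` may be a single string or a list of line strings.
    return source if isinstance(source, str) else "".join(source)


def split_notebook(raw: str):
    """Return (markdown_cells, code_cells, outputs) from .ipynb JSON text."""
    nb = json.loads(raw)
    markdown, code, outputs = [], [], []
    for cell in nb["cells"]:
        if cell["cell_type"] == "markdown":
            markdown.append(_text(cell["source"]))
        elif cell["cell_type"] == "code":
            code.append(_text(cell["source"]))
            outputs.extend(cell.get("outputs", []))
    return markdown, code, outputs


markdown_cells, code_cells, captured = split_notebook(NOTEBOOK_JSON)
print(markdown_cells, code_cells, [o["output_type"] for o in captured])
```

An adapter built this way treats the Markdown cells as extractable fragments while leaving the JSON file as the canonical payload.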
|
||||
|
||||
## 2. Kernels and execution
|
||||
|
||||
- A **kernel** is a separate language process (IPython, IRkernel, IJulia, …) speaking the
|
||||
Jupyter messaging protocol (ZeroMQ). The document is **decoupled from the kernel**: the
|
||||
`.ipynb` persists *captured* outputs; re-running requires a live kernel + the right
|
||||
environment.
|
||||
- **`execution_count`** records the order in which cells were *run*, which **need not match document
|
||||
order** — the infamous **hidden-state / out-of-order execution** problem: stored outputs
|
||||
may reflect a run sequence that no longer corresponds to top-to-bottom reading.
|
||||
- Reproducibility therefore depends on **out-of-band state**: package versions, data files,
|
||||
environment, random seeds — none captured by `nbformat` itself.
|
||||
|
||||
**Consequence for shard-wiki:** the captured outputs are a **snapshot projection with weak
|
||||
provenance** — honest treatment must mark them as "computed at run N, environment not
|
||||
guaranteed," never as live or authoritative truth.
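
One way to surface that weak provenance is to check whether the stored `execution_count` values are consistent with a single top-to-bottom run. The helper below is a hypothetical sketch operating on a parsed notebook dict; a clean result does not prove reproducibility, since the environment is still not captured.

```python
from typing import Any, Dict, List


def execution_order_issues(nb: Dict[str, Any]) -> List[str]:
    """Describe ways the stored counts deviate from one clean top-to-bottom run."""
    issues = []
    previous = 0
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        count = cell.get("execution_count")
        if count is None:
            issues.append(f"cell {index}: never executed (no stored run)")
            continue
        if count <= previous:
            issues.append(
                f"cell {index}: execution_count {count} <= {previous} "
                "seen earlier in the document"
            )
        previous = max(previous, count)
    return issues


notebook = {"cells": [
    {"cell_type": "code", "execution_count": 3, "source": "x = 1", "outputs": []},
    {"cell_type": "markdown", "source": "note"},
    {"cell_type": "code", "execution_count": 2, "source": "print(x)", "outputs": []},
    {"cell_type": "code", "execution_count": None, "source": "y = 2", "outputs": []},
]}
for issue in execution_order_issues(notebook):
    print(issue)
```

The returned strings are the kind of annotation a snapshot projection could carry alongside the outputs, instead of silently presenting them as a coherent run.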
|
||||
|
||||
## 3. The ecosystem (relevant to attach/project/translate)
|
||||
|
||||
- **nbconvert** — derives other forms from a notebook: HTML, Markdown, LaTeX/PDF, slides,
|
||||
script. This is **derivation-projection** (T1): notebook source → rendered view, lossy
whichever target is chosen (HTML keeps outputs; `--to script` keeps only code, like `tangle`).
|
||||
- **Jupytext** — represents a notebook **as** a `.py`/`.md` text file (pairing), making it
|
||||
**git-diffable plain text** and round-trippable — directly relevant to storing notebooks
|
||||
in a git shard without JSON-diff noise.
|
||||
- **papermill** — parameterize + execute a notebook to produce a new output notebook
|
||||
(notebook as a runnable template — a *derivation with inputs*).
|
||||
- **JupyterLab / Notebook / nbviewer / Colab** — front-ends; nbviewer renders a static
|
||||
read-only projection from a URL (a natural projection target).
|
||||
- **`nbstripout`** — strips outputs before commit: teams treat **outputs as derived noise**,
|
||||
keeping only source under version control — an explicit "source canonical, outputs
|
||||
derived" stance mirroring T1.
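
The "source canonical, outputs derived" stance amounts to a small transformation on the notebook JSON. This sketch is not nbstripout's implementation (the real tool handles more metadata and is typically wired in as a git filter); `strip_outputs` is a hypothetical helper that shows the core idea on a parsed notebook dict.

```python
import copy
from typing import Any, Dict


def strip_outputs(nb: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the notebook with captured outputs and run counts removed."""
    stripped = copy.deepcopy(nb)   # leave the caller's notebook untouched
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


original = {"nbformat": 4, "nbformat_minor": 5, "metadata": {}, "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": "# Notes"},
    {"cell_type": "code", "metadata": {}, "execution_count": 7, "source": "2 * 21",
     "outputs": [{"output_type": "execute_result", "execution_count": 7,
                  "metadata": {}, "data": {"text/plain": "42"}}]},
]}
clean = strip_outputs(original)
print(clean["cells"][1])
```

Working on a copy keeps the captured snapshot available for presentation while only the stripped form is committed.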
|
||||
|
||||
## 4. Capability profile (as a shard / content type)
|
||||
|
||||
| Dimension (synthesis spectrum) | Jupyter notebook |
|--------------------------------|------------------|
| Attachment mode | file-store (`.ipynb` JSON in a repo) or via Jupyter Server REST API |
| Addressing granularity | document; **cell** as sub-address (by index / id; `nbformat 4.5+` adds stable cell `id`) |
| Content identity | file path; cell `id` (4.5+) else positional |
| Structure | **ordered cell list** (markdown / code+outputs / raw); MIME-bundle outputs |
| History | VCS on the file; **JSON diffs are noisy** unless paired (Jupytext) or stripped |
| Merge model | git on JSON (poor) → **paired text** (good) or nbdime (cell-aware diff/merge) |
| Native query | none |
| Translation | nbconvert → HTML/MD/script/PDF (lossy, directional); Jupytext text pairing |
| Write granularity | file / **cell** |
| Operational envelope | a kernel + environment to (re)execute; static render needs none |
| Content opacity | **mixed**: source transparent; outputs = MIME blobs (some opaque, e.g. base64 PNG) |
| Provenance | `execution_count` (weak, out-of-order); environment **not** captured |
| **Computed-output** | **stored inline**, snapshot, reproducibility out-of-band |
|
||||
|
||||
## 5. INTENT mapping
|
||||
|
||||
### Reinforcements
|
||||
|
||||
- **Replication- vs derivation-projection (T1) confirmed and extended.** nbconvert (→HTML/
|
||||
script) and nbviewer are derivation-projections; `--to script` is literally `tangle`.
|
||||
Jupyter adds a further wrinkle: **the derived output is also stored back inside the source**
|
||||
(captured outputs), so the "source vs projection" line runs *through* the document.
|
||||
- **Union without erasure / provenance honesty.** Captured outputs must be surfaced **as
|
||||
snapshots with weak provenance** (run N, environment unguaranteed) — a concrete instance
|
||||
of "never hide freshness/authorship." The out-of-order `execution_count` is exactly the
|
||||
kind of fragility shard-wiki must *show*, not paper over.
|
||||
- **Non-Markdown content + lossy translation (UC-55/UC-59).** `.ipynb` is JSON with embedded
|
||||
MIME-bundle outputs; any Markdown projection is **lossy** (loses live outputs, kernel,
|
||||
rich MIME). Surface the lossiness; keep the JSON as canonical payload (T12/T15).
|
||||
- **Markdown island.** Markdown cells fit the text-first model, but only as *fragments
|
||||
inside* a JSON envelope — the adapter extracts/round-trips them, it does not pretend the
|
||||
notebook is a Markdown page.
|
||||
|
||||
### Divergences / boundaries
|
||||
|
||||
- **shard-wiki is not a kernel host.** Re-execution (driving a kernel) is out of scope/
|
||||
capability-gated; default treatment is **attach + present captured outputs as a snapshot
|
||||
projection** + offer nbconvert-style static render. Executing/parameterizing (papermill)
|
||||
is an optional capability, never assumed.
|
||||
- **Outputs-in-source is an anti-pattern to respect, not adopt.** Teams strip/pair outputs
|
||||
precisely because mixing derived data into the source breaks diffs. shard-wiki should
|
||||
prefer the **source-canonical, outputs-as-derived** reading (Jupytext pairing / nbstripout
|
||||
ethos) and treat stored outputs as a capturable projection.
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **Computational-notebook as a first-class content type** with cell structure + inline
|
||||
**computed outputs carrying (weak) execution provenance** — UC-84.
|
||||
2. **Outputs = derivation-projection snapshot** (T1 vocabulary): regenerable only with a
|
||||
kernel+environment; degrade gracefully to the stored snapshot / static render.
|
||||
3. **Cell-level addressing** (stable cell `id`, nbformat 4.5+) as the sub-page granularity
|
||||
for transclusion/anchoring (UC-32/UC-35).
|
||||
4. **Text-pairing (Jupytext)** as the git-friendly storage strategy — feeds the
|
||||
history-portability thread (poor JSON diffs → paired text / nbdime).
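
Cell-level addressing (item 3) could resolve a sub-page address roughly like this. The fragment form and the `resolve_cell` helper are assumptions for illustration; only the `id` field itself (nbformat 4.5+) comes from the format. The positional fallback is shown because older notebooks carry no ids.

```python
from typing import Any, Dict, Optional


def resolve_cell(nb: Dict[str, Any], fragment: str) -> Optional[Dict[str, Any]]:
    """Resolve a fragment to a cell: stable id first, then positional index."""
    cells = nb.get("cells", [])
    for cell in cells:
        if cell.get("id") == fragment:
            return cell
    # Positional fallback: fragile, because inserting a cell shifts indices.
    if fragment.isdecimal() and int(fragment) < len(cells):
        return cells[int(fragment)]
    return None


nb = {"cells": [
    {"cell_type": "markdown", "id": "intro", "source": "# Hi"},
    {"cell_type": "code", "id": "load-data", "source": "df = load()"},
]}
print(resolve_cell(nb, "load-data")["source"])
print(resolve_cell(nb, "0")["source"])
```

An anchor or transclusion that records which of the two paths resolved it can warn when it depends on position alone.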
|
||||
|
||||
## 6. UC seed
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-84 | Attach/project a **computational notebook** (`.ipynb`): preserve **cell structure** (markdown / code / output) and **embedded computed outputs**, surfacing each output **as a snapshot with its (weak) execution provenance** (run count, environment not guaranteed) — re-execution is **capability-gated**, default is present-the-snapshot + offer a static rendered projection | **new** |
| — | Notebook JSON / MIME-bundle outputs = non-Markdown content; Markdown projection is lossy | enrich **UC-55**, **UC-59** |
| — | Computed/evaluated cell = computation-defined content | enrich **UC-54** |
| — | Cell `id` (nbformat 4.5+) = sub-page address for anchor/transclusion | enrich **UC-35**, links **UC-32** |
| — | Stored outputs as derived snapshot (nbstripout/Jupytext ethos) = source-canonical/outputs-derived | links **UC-83**, **UC-79** |
|
||||
|
||||
## 7. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T12 (page model):** add **computational-notebook** as a page shape — an **ordered cell
|
||||
list** where code cells own **embedded computed outputs** (MIME bundles) with weak
|
||||
execution provenance. Distinct from prose, typed records, query-defined, inline-embedded
|
||||
objects (Quip/Notion), typed-graph (Wikibase), and the literate one-source-many-projection
|
||||
shape (UC-83). The defining new attribute: **derived output stored *inside* the source**.
|
||||
- **T15 (translation / fidelity):** `.ipynb` is non-Markdown; nbconvert→Markdown is **lossy
|
||||
and directional** (drops live outputs/kernel/rich MIME). Keep JSON canonical; any Markdown
|
||||
is a projection. MIME-bundle outputs map to the content-opacity spectrum (text→html→base64
|
||||
image = transparent→opaque).
|
||||
- **T13 (history):** JSON diffs are **noisy**; record **text-pairing (Jupytext)** and
|
||||
**cell-aware diff/merge (nbdime)** as history-portability strategies for embedded-output
|
||||
documents. Reinforces "source-canonical, outputs-derived."
|
||||
- **T16 (projection):** captured outputs are a **derivation-projection snapshot**;
|
||||
re-execution (kernel) and parameterized execution (papermill) are **capabilities**, not
|
||||
assumptions; degrade to the stored snapshot / nbviewer-style static render.
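
For T15, the transparent-to-opaque ordering of MIME types suggests a simple selection rule when projecting an output. The preference order and the `select_mime` helper below are assumptions, not anything `nbformat` prescribes; front-ends choose their own order.

```python
from typing import Dict, Optional, Tuple

# Assumed order, most transparent first; a deployment could configure this.
PREFERENCE = ("text/markdown", "text/plain", "text/html",
              "application/json", "image/png")


def select_mime(bundle: Dict[str, object],
                preference: Tuple[str, ...] = PREFERENCE
                ) -> Optional[Tuple[str, object]]:
    """Pick the most transparent representation available in a MIME bundle."""
    for mime in preference:
        if mime in bundle:
            return mime, bundle[mime]
    # Only unlisted (e.g. vendor) types present: return the first one as-is,
    # to be treated as an opaque blob by the caller.
    for mime, payload in bundle.items():
        return mime, payload
    return None


bundle = {"image/png": "iVBORw0KGgo=", "text/plain": "<Figure size 640x480>"}
print(select_mime(bundle))
```

Choosing one representation is itself a lossy projection, so the full bundle would stay in the canonical JSON.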
|
||||
|
||||
## 8. Open questions
|
||||
|
||||
1. Does shard-wiki ever **re-execute** a notebook (host/broker a kernel), or strictly
|
||||
attach + present captured outputs + static render? (Same scope boundary as UC-83/UC-56
|
||||
"do we ever drive the derivation.")
|
||||
2. Is **UC-84** distinct from **UC-83**, or is a notebook just the "outputs-stored-in-source"
|
||||
special case of the literate one-source-many-projection pattern? (Kept separate: UC-84's
|
||||
defining trait is *captured derived output embedded in the canonical source with weak
|
||||
provenance* — a page-model attribute UC-83 doesn't carry.)
|
||||
3. How are **MIME-bundle outputs** represented in the page model — opaque provenance-tagged
|
||||
blobs, a typed-asset registry (UC-55 open question #10), or selected-MIME projection?
|
||||
4. Default storage: attach `.ipynb` **as-is** (JSON, noisy diffs) or prefer a **paired text
|
||||
representation** when the shard is a git repo? (Policy → configurable.)
|
||||
|
||||
## 9. Sources
|
||||
|
||||
- Jupyter `nbformat` reference (cells, outputs, MIME bundles, cell `id` 4.5+);
|
||||
Jupyter messaging protocol / kernels docs.
|
||||
- **nbconvert**, **nbviewer**, **JupyterLab**, **Colab** docs.
|
||||
- **Jupytext**, **papermill**, **nbdime**, **nbstripout** project docs.
|
||||
- prior: `research/260614-literate-programming-deep-dive/` (replication- vs
|
||||
derivation-projection, UC-83); `research/260614-notion-deep-dive/` (block-JSON,
|
||||
external-API), `research/260614-quip-deep-dive/` (inline embedded objects, UC-55/58/59).
|
||||
|
||||
## 10. Traceability

New UC **UC-84** carries the marker **⊜** in the wikiengines column of
`spec/UseCaseCatalog.md` (true lineage = this dive). Enriched: UC-54, UC-55, UC-59, UC-35;
links UC-32, UC-83, UC-79. Architecture cross-refs: SHARD-WP-0002 T12 (notebook page shape:
outputs embedded in source), T15 (lossy non-Markdown translation; MIME opacity), T13
(paired-text / nbdime history), T16 (output = derivation-projection snapshot; kernel =
capability).

35
research/260614-literate-programming-deep-dive/README.md
Normal file
@@ -0,0 +1,35 @@
# 260614 — Literate Programming (Knuth's WEB / weave / tangle) deep dive

Date: 2026-06-14 · Source: **SHARD-WP-0004 T1**

## What this is

A deep dive into **literate programming** — Knuth's WEB and the **`weave`/`tangle`** model
— as the deepest ancestor of shard-wiki's **projection** and **transclusion** ideas
applied to *executable* content. The keystone: **one canonical source → two co-derived
projections** (typeset docs via `weave`, compilable code via `tangle`), plus **named code
chunks** assembled by reference (transclusion).

## Why it matters

- Establishes **one-source-many-projections** as a page-model + projection pattern that
  *generalizes* compile-to-static (UC-79, single output) to **N co-equal, semantically
  different** derived views — feeds SHARD-WP-0002 **T12/T16**.
- Splits projection into **replication-projection** (lazy cache, current default) vs
  **derivation-projection** (transform/compile/weave/evaluate) — a distinction the rest of
  the computational batch (notebooks, REPLs) extends.
- Named chunks are the **executable-content face of transclusion / compose-by-reference**
  (UC-32 / UC-44).

## Yield

- **UC-83** (new): attach a single-source-multiple-projection (literate) artifact; present
  each derived view with output→source provenance; edits target the source.
- Enrich **UC-32**, **UC-44**, **UC-79**; links **UC-54** (computed/evaluated projection,
  → T3 Jupyter).

## Contents

| Path | Role |
|------|------|
| `findings.md` | WEB model, named-chunk transclusion, descendants (noweb/org-babel/knitr/Jupytext), capability profile, INTENT mapping, UC seed, architecture notes, open questions |

180
research/260614-literate-programming-deep-dive/findings.md
Normal file
@@ -0,0 +1,180 @@
# Literate Programming (Knuth's WEB / weave / tangle) — deep dive (findings)

**Date:** 2026-06-14 · **Source:** SHARD-WP-0004 T1 · **Subject:** Donald Knuth's WEB
system and the literate-programming model (`weave`/`tangle`, named chunks).

## Why this dive

SHARD-WP-0004 asks the carried question: *can a shard-wiki page be a source that is
woven/evaluated into rendered forms, and how do projection/transclusion/provenance treat
the source vs the output?* Literate programming is the **deepest ancestor** of that idea.
Knuth (1984) inverted the program/comment relationship: you write a **document** for
humans whose fragments happen to also be the program. From the **one WEB source** two
tools derive two artifacts: **`weave` → typeset documentation** (TeX) and **`tangle` →
compilable code** (Pascal/C/…). This is *one source, two projections* in its purest,
oldest form — the conceptual root of shard-wiki's **projection** and **transclusion**.

## 1. The WEB model: one source, two tools

A WEB file interleaves prose and code in author-chosen order (the order that best
*explains*, not the order the compiler needs). Two programs read it:

- **`weave`** produces a `.tex` file → typeset documentation: prose, pretty-printed code,
  a cross-reference index of where each chunk and identifier is defined/used.
- **`tangle`** produces a compilable source file → reorders and expands the code chunks
  into the sequence the compiler demands, strips the prose, macro-expands references.

The crucial property: **the two outputs are co-derived from one canonical source and serve
semantically different audiences** (human reader vs compiler). Neither output is the
source; both are **regenerable, disposable derivations**. Editing happens on the WEB; you
never edit the woven `.tex` or the tangled `.c`.

## 2. Named chunks = transclusion of code fragments

The organizing primitive is the **named section / code chunk**:

- A chunk is declared `<<name>>=` and **referenced** elsewhere as `<<name>>` (noweb
  notation; WEB itself writes `@<name@>=` and `@<name@>`). `tangle` expands references in
  place (recursively) to assemble the final program.
- A chunk name can be **defined in multiple places** and is *accreted* (later `+=`
  additions append) — so one logical unit is authored across scattered locations.
- References can appear before definitions; resolution is by name, not by position.

This is **transclusion by reference** (UC-32) and **compose-by-reference / manifest**
(UC-44, the EDL/Xanadu lineage): the document is a graph of named fragments assembled at
derivation time. Knuth's chunk graph is the same shape as Xanadu's reference-not-copy and
shard-wiki's "compose a page from referenced parts" — applied to *executable* content and
resolved by a build tool rather than a viewer.

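The chunk mechanics described above (definition by name, accretion, recursive expansion, resolution independent of position) can be sketched as a toy tangler. This is a simplified illustration in noweb-style notation, not a reimplementation of `tangle` or `notangle`: it assumes a reference sits alone on its line and that a lone `@` line ends a code chunk. The function names are hypothetical. Each output line keeps the source line it came from, which is the raw material for output→source provenance.

```python
import re
from collections import defaultdict

CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")       # opens (or continues) a chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")   # a reference alone on its line


def parse_chunks(source: str) -> dict:
    """Collect chunk bodies by name; repeated definitions accrete in order.

    Each body line is stored with its 1-based source line number.
    """
    chunks = defaultdict(list)
    current = None
    for lineno, line in enumerate(source.splitlines(), start=1):
        m = CHUNK_DEF.match(line)
        if m:
            current = m.group(1)
        elif line.strip() == "@":        # simplification: lone '@' ends the chunk
            current = None
        elif current is not None:
            chunks[current].append((lineno, line))
    return dict(chunks)


def tangle(chunks: dict, root: str, indent: str = "", stack: tuple = ()) -> list:
    """Expand `root` recursively; return a list of (source_line, text) pairs."""
    if root in stack:
        raise ValueError("cyclic chunk reference: " + " -> ".join(stack + (root,)))
    if root not in chunks:
        raise KeyError("undefined chunk: " + root)
    out = []
    for lineno, line in chunks[root]:
        ref = CHUNK_REF.match(line)
        if ref:
            # Resolve by name, carrying the reference's indentation downward.
            out.extend(tangle(chunks, ref.group(2), indent + ref.group(1), stack + (root,)))
        else:
            out.append((lineno, indent + line))
    return out
```

Because a chunk may be accreted from several places, the provenance of one tangled chunk is a list of source locations rather than a single one; the pairs returned here make that visible line by line.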
## 3. The descendants (noweb, CWEB, org-babel, Sweave/knitr, Jupytext)

- **CWEB** (Knuth/Levy): WEB for C. **noweb** (Ramsey): language-agnostic, minimal markup
  (`<<chunk>>`), `notangle`/`noweave` — proof that the *model* (chunks + two projections)
  is separable from any one language or typesetter.
- **org-babel** (Emacs Org-mode): named source blocks, `:noweb` references, **tangle** to
  files **and evaluate** blocks inline (results captured back into the document) — literate
  programming fused with notebook execution (bridges to T2/T3).
- **Sweave / knitr** (R): weave prose + R, executing code and **interleaving computed
  results/figures** into the woven document — adds the *computed-output* dimension that
  Jupyter (T3) centers on.
- **Jupytext**: represents a Jupyter notebook **as** a literate text/Markdown source —
  closing the loop: the notebook (T3) becomes a weave/tangle-style plain-text source.

The throughline: **one canonical source → N derived projections**, where projections may
be (a) reformatted (weave), (b) reordered/extracted (tangle), or (c) **evaluated**
(org-babel/knitr) — the evaluated case is exactly the computational-page question.

## 4. Capability profile (as a would-be shard / page type)

| Dimension (synthesis spectrum) | Literate-programming source |
|--------------------------------|-----------------------------|
| Attachment mode | file-store (a WEB/`.nw`/`.org` text source in a repo) |
| Addressing granularity | document; **named chunk** as sub-page address |
| Content identity | source file path + chunk name (name-resolved, not position) |
| Structure | **graph of named chunks** assembled by reference |
| History | whatever VCS holds the source (git) — text, diffable |
| Merge model | text/git merge on the source |
| Native query | none; `weave` emits a cross-reference index (derived) |
| Translation | source → woven docs **and** tangled code (build-time) |
| Write granularity | file / **chunk** (text region) |
| Operational envelope | a build tool (`tangle`/`weave`/`noweb`/babel) |
| Content opacity | transparent text |
| Provenance | VCS author/time; chunk cross-ref maps output→source location |
| **Projection model** | **one source → many co-equal derived projections** (new emphasis) |

## 5. INTENT mapping

### Reinforcements

- **Projection (canonical vs derived).** Literate programming is the archetype of
  "canonical source, regenerable derived view" — the same principle as ikiwiki
  compile-to-static (UC-79), but generalized: **two-plus co-equal projections** from one
  source (docs *and* code), not a single output. Strengthens the rule *never confuse a
  projection for the source*.
- **Transclusion / compose-by-reference.** Named chunks are transclusion (UC-32) and a
  manifest of referenced parts (UC-44) — resolved at derivation time. Confirms
  transclusion=clone=embed=reference as one primitive that also covers *fragment assembly
  of executable content*.
- **Markdown-first but backend-neutral page model.** noweb/org/Jupytext show the literate
  source can *be* Markdown-ish plain text — so a "literate page" fits the text-first model;
  the *derivations* are the non-text part.
- **Mechanism over policy.** weave/tangle are mechanisms; *which* projections to
  materialize, when to regenerate, and where outputs go stay configurable.

### Divergences / boundaries

- **shard-wiki is not a build system.** It should *recognize and present* a
  source-with-projections (attach the source, surface derived views with provenance), not
  re-implement `tangle`/kernels. Materializing a projection may delegate to the source's
  own tool or be capability-gated to "snapshot only."
- **The interesting projection is derivation, not caching.** shard-wiki's base projection
  is cache-like (lazy copy of remote content, UC-53/57). Weave/tangle is a *different*
  projection species: **transform/derive** one source into rendered forms. The contract
  should model projection as having (at least) two kinds: **replication-projection** and
  **derivation-projection** (compile/weave/evaluate).

### What to keep

1. **One-source-many-projections** as a first-class page-model + projection pattern
   (generalizes UC-79's single output) — see UC-83.
2. **Named-chunk transclusion** as the executable-content face of UC-32/UC-44 (assembly by
   reference at derivation time).
3. **Output→source provenance** (the woven cross-ref index): every derived view must point
   back to the exact source location it came from — never present derived output without
   that link (union-without-erasure for derivations).

## 6. UC seed

| # | Seed | Disposition |
|---|------|-------------|
| UC-83 | Attach a **single-source-multiple-projection** artifact (a literate/woven source): treat the source as canonical, present each **derived projection** (e.g. a documentation view *and* a code view) with **provenance back to the one source**, edits target the source and projections regenerate (delegated to the source's tool or degraded to a static snapshot) | **new** |
| — | Named chunks `<<name>>` = transclusion / compose-by-reference of (executable) fragments | enrich **UC-32**, **UC-44** |
| — | Generalize compile-to-static (single output) to **N co-equal projections** from one source | enrich **UC-79** |
| — | Computed/evaluated projection (org-babel/knitr) = derivation-projection with results | links **UC-54**, foreshadows **T3** |

## 7. Architecture notes for SHARD-WP-0002

- **T12 (page model):** add **one-source-many-projections** as a page-model shape — a page
  may be a *source* whose presented forms are **derivations** (woven docs, tangled code,
  evaluated results), each carrying output→source provenance. Distinct from prose, typed
  records, query-defined, and inline-embedded objects already logged.
- **T16 (projection):** split projection into **replication-projection** (lazy cache of
  remote content — current default) vs **derivation-projection**
  (transform/compile/weave/evaluate a source into rendered forms). Derivation-projection is
  regenerable, may delegate to an external tool, and degrades to a captured snapshot when
  the tool is absent (graceful degradation).
- **Transclusion (T16):** named-chunk-by-name resolution is a transclusion variant where
  the *target is a fragment resolved by name across the source*, assembled at derivation
  time — a concrete shape for UC-32/UC-44 mechanics.
- **Capability gating:** "can derive projection X" is a capability; a shard that can't run
  the tool still exposes the source + any pre-built/snapshot projections (UC-83 degrade).

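The T16 split and the capability-gated degrade path could be modeled roughly as follows. This is a sketch under stated assumptions: the names `ProjectionKind`, `Projection`, and `derive` are hypothetical and do not come from the SHARD-WP-0002 contract, and the external tool is stood in for by any callable from source text to rendered text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ProjectionKind(Enum):
    REPLICATION = "replication"   # lazy cache of remote content
    DERIVATION = "derivation"     # transform/compile/weave/evaluate


@dataclass
class Projection:
    kind: ProjectionKind
    content: str
    source_ref: str               # output -> source provenance link
    is_snapshot: bool = False     # True when served from a captured copy


def derive(source_ref: str,
           source_text: str,
           tool: Optional[Callable[[str], str]],
           snapshot: Optional[str]) -> Optional[Projection]:
    """Derive a view if the tool is available, else degrade to a snapshot.

    Returns None when neither is available, in which case only the source
    itself can be presented.
    """
    if tool is not None:
        return Projection(ProjectionKind.DERIVATION, tool(source_text), source_ref)
    if snapshot is not None:
        return Projection(ProjectionKind.DERIVATION, snapshot, source_ref, is_snapshot=True)
    return None
```

Keeping `is_snapshot` on the projection, rather than hiding the fallback, lets a presenter label a captured view honestly instead of implying it was freshly derived.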
## 8. Open questions

1. Does shard-wiki ever *drive* a derivation (run weave/tangle/evaluate), or only attach
   sources and surface pre-built projections + snapshots? (Same scope question as UC-56
   "do we ever compile-to-static ourselves," now for literate sources.)
2. Is **UC-83** distinct enough from UC-79 (compile-to-static) to stand alone, or should
   UC-79 be re-read as the single-output special case of UC-83's N-projection general case?
   (Recorded as a possible later consolidation; kept separate now because UC-83's
   projections are *co-equal and semantically different audiences*, not one publish target.)
3. How is **output→source provenance** represented when a derived line came from a chunk
   accreted across several source locations (the cross-ref is many-to-one)?

## 9. Sources

- Knuth, *Literate Programming* (1984, *Computer Journal*); the WEB user manual; *TeX: The
  Program* / *MMIX* as canonical WEB exemplars.
- Ramsey, *noweb* (a simple, extensible literate-programming tool).
- CWEB (Knuth & Levy); Emacs **Org-mode babel** (tangle + evaluate); **Sweave**/**knitr**
  (R); **Jupytext** (notebook-as-text).
- prior: `research/260614-ikiwiki-deep-dive/` (compile-to-static, canonical-vs-derived);
  `research/260614-xanadu-deep-dive/` (compose-by-reference / EDL, UC-44).

## 10. Traceability

New UC **UC-83** carries the marker **⊛** in the federation column of
`spec/UseCaseCatalog.md` (true lineage = this dive; literate programming is design prior
art, not a candidate shard, so the marker sits with the projection/compose-by-reference
family). Enriched: UC-32, UC-44, UC-79; links UC-54. Architecture cross-refs: SHARD-WP-0002
T12 (one-source-many-projections page shape), T16 (replication- vs derivation-projection;
named-chunk transclusion).

32
research/260614-mathematica-deep-dive/README.md
Normal file
@@ -0,0 +1,32 @@
# 260614 — Mathematica Notebooks deep dive

Date: 2026-06-14 · Source: **SHARD-WP-0004 T2**

## What this is

A deep dive into **Wolfram Mathematica Notebooks** — the **original computational notebook**
(1988), the ancestor Jupyter descends from. The `.nb` document **is itself a Wolfram Language
expression** (`Notebook[{Cell[…], …}]`) with **nested cell groups**, kernel evaluation
(`In`/`Out`), **structured (symbolic/graphics) results**, live `Manipulate`/`Dynamic`
widgets, and CDF as a reduced-runtime distribution projection.

## Why it matters

- Confirms the **notebook page shape (UC-84) is a genus**, not a Jupyter quirk: cells +
  cached outputs + fragile counter provenance + kernel-gated re-execution predate `.ipynb`.
- Two refinements: notebooks have **nestable cell groups (an outline tree)**, not just an
  ordered list; outputs can be **structured re-evaluable values**, not only MIME blobs →
  adds a "structured re-evaluable value" point to the content-opacity spectrum.
- `Manipulate`/`Dynamic` join the **static-projection-impossible** end (snapshot-only) —
  foreshadows Strudel (T5).

## Yield

- **No new UC** (lineage/reinforcement). Reinforces **UC-84**; enriches **UC-54, UC-55**;
  links UC-34, UC-83.

## Contents

| Path | Role |
|------|------|
| `findings.md` | `.nb` expression/cell model, evaluation & provenance, CDF, capability delta vs Jupyter, INTENT mapping, UC disposition (enrichment-only), architecture notes |

139
research/260614-mathematica-deep-dive/findings.md
Normal file
@@ -0,0 +1,139 @@
# Mathematica Notebooks — deep dive (findings)

**Date:** 2026-06-14 · **Source:** SHARD-WP-0004 T2 · **Subject:** Wolfram Mathematica
Notebooks — the `.nb` format, the cell/expression model, symbolic evaluation.

## Why this dive

Mathematica (1988) is the **original computational notebook** — the ancestor Jupyter (T3)
descends from. This dive checks whether the notebook page-model conclusions from UC-84 hold
at the genus level or need extension. It is **medium priority / lineage**: a candidate
*content type* like Jupyter, but closed/proprietary, so the expected yield is **reinforcement
of UC-84** plus a couple of distinct wrinkles (the document *is itself* a Wolfram-language
expression; symbolic, not just textual, results).

## 1. The `.nb` document model

A Mathematica notebook is itself a **Wolfram Language expression** —
`Notebook[{Cell[...], Cell[...], ...}, opts]`. So the document format and the language are
the *same* substrate:

- **Cells** are typed: `Input`, `Output`, `Text`, `Title`/`Section` (structure), `Code`,
  with **cell groups** nesting into an **outline tree** (the document has real hierarchy, not
  just a flat list like `.ipynb`).
- An **`Input` cell** holds an expression; evaluating it produces a linked **`Output` cell**
  containing the **result expression** (symbolic, graphical via `Graphics[...]`, or
  typeset). The result is a **first-class expression**, not a MIME blob — it can be
  re-evaluated, edited, transcluded.
- The whole `.nb` is **plain-text-serializable** (it's an expression) but verbose and
  proprietary in conventions; output cells are **cached results** stored in the file.

So like Jupyter: **source + cached computed output fused in one document**, with
**out-of-band reproducibility** (kernel + package/version state). Unlike Jupyter: **nested
cell groups (a tree)** and **results that are structured Wolfram expressions** rather than
MIME bundles.

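The difference between a flat cell list and an outline tree can be illustrated with a small data model. This is a simplified sketch, not the `.nb` serialization: the classes `Cell` and `CellGroup` and the `outline` helper are hypothetical names, a group is modeled as a head cell plus children, and cell content is held as an unevaluated string.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Cell:
    style: str      # e.g. "Title", "Section", "Input", "Output", "Text"
    content: str    # kept as an unevaluated string in this sketch


@dataclass
class CellGroup:
    head: Cell                                              # cell that opens the group
    children: List[Union["CellGroup", Cell]] = field(default_factory=list)


def outline(node, depth=0):
    """Flatten the tree into (depth, style, content) rows in document order."""
    if isinstance(node, Cell):
        return [(depth, node.style, node.content)]
    rows = [(depth, node.head.style, node.head.content)]
    for child in node.children:
        rows.extend(outline(child, depth + 1))
    return rows


notebook = CellGroup(Cell("Title", "Demo"), [
    CellGroup(Cell("Section", "Algebra"), [
        CellGroup(Cell("Input", "Expand[(x+1)^2]"), [Cell("Output", "1 + 2 x + x^2")]),
    ]),
    Cell("Text", "Closing remark"),
])
```

Calling `outline(notebook)` yields one row per cell with its nesting depth, which is the information a flat `.ipynb`-style cell list does not carry.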
## 2. Evaluation, provenance, reproducibility

- A **kernel** evaluates input cells; `In[n]`/`Out[n]` counters mirror Jupyter's
  `execution_count` and carry the **same fragility** (out-of-order evaluation, hidden
  symbol/global state, kernel-version dependence). No environment capture in the file.
- **Dynamic/interactive output** (`Manipulate`, `Dynamic`) embeds **live interactive
  widgets** whose state is computed on view — these have **no faithful static form** beyond a
  snapshot frame (echoes Strudel T5's "live, time/interaction-based content" limit).
- **CDF (Computable Document Format)** is Wolfram's *projection-for-distribution*: a notebook
  rendered with a free runtime so readers can interact without a full license — a
  derivation-projection (T1) with a reduced-capability viewer.

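The counter fragility noted above can be surfaced with a simple check over the counters in document order. This is a heuristic sketch with a hypothetical function name: a decreasing counter shows that cells were evaluated out of document order, but monotone counters do not prove reproducibility, since hidden kernel state and version drift leave no trace in the counters.

```python
from typing import List, Optional


def out_of_order_cells(counters: List[Optional[int]]) -> List[int]:
    """Return document positions whose counter is lower than an earlier one.

    `counters` lists In[n] / execution_count values in document order;
    None marks a cell that was never evaluated and is skipped.
    """
    suspicious = []
    highest = 0
    for position, n in enumerate(counters):
        if n is None:
            continue
        if n < highest:
            suspicious.append(position)
        highest = max(highest, n)
    return suspicious
```

A presenter could use the result to attach an "evaluated out of order" note to the affected cached outputs rather than showing them as if they followed from the cells above.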
## 3. Capability profile (delta vs Jupyter UC-84)

| Dimension | Mathematica `.nb` (delta from Jupyter) |
|-----------|----------------------------------------|
| Structure | **nested cell groups → outline tree** (richer than `.ipynb`'s flat cell list) |
| Output type | **structured Wolfram expressions** (symbolic/graphics), not MIME blobs |
| Document = language | the `.nb` **is** a Wolfram expression (format ≡ language) |
| Liveness | `Manipulate`/`Dynamic` = **interactive widgets**, snapshot-only when static |
| Opacity | proprietary serialization; results re-evaluable only with a Wolfram kernel |
| Projection-for-distribution | **CDF** = reduced-runtime interactive projection |
| Otherwise | same as UC-84: cells, cached outputs, fragile `In/Out` provenance, kernel-gated |

## 4. INTENT mapping

### Reinforcements (mostly confirms UC-84)

- **Notebook page shape (UC-84) is a genus, not a Jupyter quirk.** Mathematica predates and
  matches it: cells + cached computed outputs + fragile counter provenance + kernel-gated
  re-execution. Confirms the page model should carry a **notebook shape** generically (T12),
  not a `.ipynb`-specific one.
- **Outputs as derivation-projection snapshots (UC-83/84).** Cached `Output` cells are
  snapshots; honest treatment marks them "evaluated run N, kernel/env unguaranteed."
- **Derivation-projection for distribution (T1).** CDF is a clean "reduced-capability
  interactive projection of a source" — a real-world instance of degrade-by-capability.

### Distinct wrinkles (extend the notes, not new UCs)

- **Nested cell-group outline** — the notebook page model should allow **hierarchical cell
  grouping**, not just an ordered list (generalize UC-84's "ordered cells" to
  "ordered/*nestable* cells"). Feeds T12.
- **Structured (non-MIME) results** — outputs can be **typed structured values** (symbolic
  expressions), not only MIME blobs; reinforces UC-55's "typed asset" reading over "opaque
  blob," and links the typed-record page model (UC-34) — the content-opacity spectrum needs a
  "structured re-evaluable value" point, not just text↔blob.
- **Format ≡ language** — a curiosity, not actionable for us beyond noting that some shards'
  document format is *the same artifact* as their computation (don't assume doc/code split).
- **Live interactive widgets** — `Manipulate`/`Dynamic` join Strudel (T5) at the
  **static-projection-impossible** end: capture a snapshot frame, never imply interactivity.

### Boundaries

- Proprietary + kernel-gated → default **read/projection/snapshot**; attach the `.nb`
  (or an exported form), present cached outputs as snapshots, offer a static/CDF projection;
  **no kernel host** (same rule as Jupyter UC-84, GT T7).

## 5. UC disposition (enrichment-only — no new UC)

| Mechanism (findings §) | Catalog UC |
|------------------------|------------|
| Cells + cached computed outputs + fragile In/Out provenance; kernel-gated (§1, §2) | UC-84 (reinforced) |
| Nested cell groups → outline tree (richer than flat `.ipynb`) (§1) | UC-84 (enriched: nestable cells); links UC-34 |
| Output = structured re-evaluable Wolfram expression, not MIME blob (§1) | UC-55 (enriched: structured value point on opacity spectrum) |
| Input cell = computation-defined content (§1) | UC-54 (enriched) |
| `Manipulate`/`Dynamic` interactive output = snapshot-only (§2) | links UC-55, foreshadows T5 |
| CDF = reduced-runtime interactive distribution projection (§2) | links UC-83 (derivation-projection) |

Mathematica is a **lineage/reinforcement** dive — it **adds no new UC**, confirming UC-84's
notebook page shape as a genus and contributing two refinements (nestable cells; a
**structured re-evaluable value** point on the content-opacity spectrum).

## 6. Architecture notes for SHARD-WP-0002

- **T12 (page model):** generalize the notebook shape from "ordered cells" (UC-84) to
  **ordered/nestable cell groups (an outline tree)**; allow code-cell outputs to be **typed
  structured values**, not only MIME blobs.
- **T11/T15 (content opacity / fidelity):** add a **"structured re-evaluable value"** point
  to the content-opacity spectrum between transparent-text and opaque-blob (Wolfram
  expression, symbolic result) — relevant to how outputs are stored/surfaced.
- **T16 (projection):** CDF is a **reduced-capability interactive projection**; with
  `Manipulate`/`Dynamic`, static projection is a **snapshot frame only** (join the
  live-content limit recorded for Strudel T5).

## 7. Open questions

1. Is a **structured re-evaluable result** (Wolfram expression) modeled as a typed value in
   the page model, or stored opaquely with provenance like other computed outputs? (Ties
   UC-55 open-Q #10 and UC-84 Q3.)
2. Do interactive outputs (`Manipulate`, `Dynamic`, and Jupyter widgets) deserve a shared
   **"interactive, snapshot-only" content marker** in the contract? (Recurs at T4/T5.)

## 8. Sources

- Wolfram documentation: notebook format (`Notebook`/`Cell` expressions), cell types &
  groups, evaluation (`In`/`Out`), `Manipulate`/`Dynamic`, CDF.
- prior: `research/260614-jupyter-deep-dive/` (UC-84 notebook shape; the descendant);
  `research/260614-literate-programming-deep-dive/` (derivation-projection, UC-83).

## 9. Traceability

**No new UC** (lineage/reinforcement). Reinforced: UC-84; enriched: UC-54, UC-55; links
UC-34, UC-83. Architecture cross-refs: SHARD-WP-0002 T12 (nestable cell-group outline; typed
structured outputs), T11/T15 (structured-re-evaluable-value point on the opacity spectrum),
T16 (CDF reduced-capability projection; interactive = snapshot-only).

15
research/260614-mojomojo-deep-dive/README.md
Normal file
@@ -0,0 +1,15 @@
# 260614 — MojoMojo deep dive

Deep dive on **MojoMojo**: a **Perl Catalyst / DBIx::Class** DB-backed wiki — hierarchical
pages, attachments, inline (AJAX) editing, Markdown content, and **page history in
relational version tables**. The classic **MVC DB-backed** contrast to the file-store
classics: no file store, no real content API → attach by reading the **relational store
directly**.

- `findings.md` — architecture, the DB schema shape, capability profile, INTENT mapping, UC
  seed (UC-81), architecture notes for SHARD-WP-0002, open questions, sources, traceability.

Catalog yield: UC-81 (attach a **DB-backed wiki with no file store/API** by reading its
relational store directly — pages + version tables — and importing DB-resident history to the
journal). Enriched UC-02/40/36/34. Feeds SHARD-WP-0002 T13 (history portability), T14
(direct-DB binding).

133
research/260614-mojomojo-deep-dive/findings.md
Normal file
@@ -0,0 +1,133 @@
# MojoMojo — deep dive (findings)

**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T8 · **Subject:** MojoMojo, a Perl
Catalyst wiki/CMS.

## Why this dive

The file-store classics (TWiki, Foswiki, Oddmuse, UseMod) keep pages as files; modern
SaaS tools keep them behind APIs. MojoMojo is the **classic relational-DB-backed** wiki — a
Catalyst MVC app over **DBIx::Class** with pages and their history in **SQL tables**, and
**no file store and no first-class content API**. It anchors the *"attach by reading the
database directly"* hard case the adapter contract must account for (T13/T14).

## 1. Architecture

- **Stack:** Perl **Catalyst** (MVC web framework) + **DBIx::Class** (ORM) over a relational
  DB (SQLite / PostgreSQL / MySQL). Templating via Template Toolkit.
- **Content:** **Markdown** (Text::MultiMarkdown) is the page markup — so the *body* is
  Markdown, but it lives **in a DB column**, not a file.
- **Pages are hierarchical:** a **path tree** (`/parent/child`) modeled as rows with
  parent/lineage relations — structure is relational, not directory-based.
- **Versioning:** each page edit creates a **new version row** (a `page_version`-style
  table) — full revision history lives in **DB version tables**, with author/timestamp.
- **Features:** inline **AJAX editing**, **attachments** (stored as DB rows / blobs +
  metadata), diffs, RSS feeds, full-text search, per-page permissions.

## 2. The attach problem — DB or nothing

MojoMojo exposes its content through the **web app** (HTML) and the **database**; there is
**no clean REST/GraphQL content API** and **no file store**. So a shard adapter has two
realistic paths:

1. **Direct relational read** (preferred): read the `page` + `page_version` (+ `content`,
   `attachment`) tables via DBIx::Class schema — pages, the path tree, and **full history**
   are all there, importable to the coordination journal (UC-41-style history import, but
   from **DB version rows** rather than RCS/git).
2. **HTML scrape** (fallback): parse rendered pages — lossy, last resort.

This makes MojoMojo the **direct-DB-read** binding archetype: the canonical store is a
relational schema, and the adapter's job is to map that schema to the wiki page model +
journal.

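The direct-relational-read path could look roughly like the sketch below. The schema here is a deliberately simplified, hypothetical stand-in (a `page` table with `parent_id` links and a `page_version` table carrying the body), not MojoMojo's actual schema, and it reads through Python's `sqlite3` rather than DBIx::Class. The function names and the journal-entry shape are likewise assumptions for illustration.

```python
import sqlite3


def read_history(conn: sqlite3.Connection):
    """Yield one journal-style entry per version row, ordered per page.

    Assumes hypothetical tables:
      page(id, parent_id, name)
      page_version(page_id, version, author, created, body)
    """
    rows = conn.execute(
        """
        SELECT p.id, p.parent_id, p.name, v.version, v.author, v.created, v.body
        FROM page AS p
        JOIN page_version AS v ON v.page_id = p.id
        ORDER BY p.id, v.version
        """
    )
    for page_id, parent_id, name, version, author, created, body in rows:
        yield {
            "page_id": page_id,
            "parent_id": parent_id,
            "name": name,
            "revision": version,
            "author": author,
            "timestamp": created,
            "markdown": body,      # body is already Markdown; no format translation
        }


def page_path(conn: sqlite3.Connection, page_id: int) -> str:
    """Rebuild the /parent/child path by walking parent_id links to the root."""
    parts = []
    while page_id is not None:
        name, page_id = conn.execute(
            "SELECT name, parent_id FROM page WHERE id = ?", (page_id,)
        ).fetchone()
        parts.append(name)
    return "/" + "/".join(reversed(parts))
```

For a read-only binding against a SQLite file, opening the connection with `sqlite3.connect("file:wiki.db?mode=ro", uri=True)` (the file name is a placeholder) keeps the adapter from writing by accident. A real adapter would also need to pin the schema version it was written against, since the schema is a coupling that can drift.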
## 3. Capability profile

| Dimension (synthesis spectrum) | MojoMojo |
|--------------------------------|----------|
| Attachment mode | **direct DB read** (relational); HTML-scrape fallback; no file store, no API |
| Addressing granularity | page (row); path tree via lineage rows |
| Content identity | DB page id; path as human key |
| Identity vs placement | row id vs path lineage (separable) |
| Structure | **relational**: page rows + parent/lineage; attachments as rows |
| History | **DB version tables** (per-edit version rows, author/timestamp) |
| Merge model | app-level last-writer; DB transactions |
| Native query | SQL over the schema (not a wiki query language) |
| Translation | **Markdown body in a DB column** — minimal translation, but extraction needed |
| Write granularity | page (row) per save |
| Operational envelope | a Perl app + its DB; direct DB access needs credentials |
| Access grant | per-page permissions in DB; app auth |
| Content opacity | transparent if you can read the DB |
| Provenance | author/timestamp on version rows |

## 4. INTENT mapping

### Reinforcements

- **Backend-neutral page model**: the body is **Markdown** — once extracted from the DB
  column it maps directly; the adapter's work is **schema→page-model**, not format
  translation.
- **History portability** (T13): DB **version rows** are a third history-source shape beside
  git commits and RCS files — importable to the journal as discrete revisions with
  author/timestamp.
- **Graceful degradation**: even with only DB read (no API), MojoMojo is a usable
  read/projection/backup shard; with DB write it could be write-through, but carefully
  (app invariants).

### Divergences (boundaries / notes)

- **No file store, no API** ⇒ the **direct-DB-read** binding is a first-class attach mode the
  contract must name (alongside file-store, in-engine host, external-API, CRDT, P2P) — or a
  sub-mode of "external store" where the medium is **a relational schema** (T14). Reading a
  third-party app's DB is **coupling to its schema** (versioned, may drift across MojoMojo
  versions) — a stated risk (UC-43 backend-swap analogue at the schema level).
- **Writing by direct DB** risks violating app invariants (lineage, version counters,
  search index) — default to **read/projection/overlay**; write-through only with the app's
  cooperation.

### What to keep
|
||||
|
||||
1. **Direct-DB-read as a named binding** for DB-backed engines with no file/API (UC-81),
|
||||
mapping a **relational schema → wiki page model + journal**.
|
||||
2. **DB version rows as a history source** for the journal (T13), beside git and RCS.
|
||||
3. **Schema-coupling caution** — treat the schema as a versioned interface that can drift
|
||||
(relates to UC-43).
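
The direct-DB-read binding and the version-row history import can be sketched in Python, with an in-memory SQLite database standing in for the real store. The table and column names (`page`, `content`, `version`, `author`, `created`, `body`) are hypothetical and only loosely follow the Page / Content split; a real adapter would need to pin them to one MojoMojo schema version, which is exactly the schema-coupling risk noted above.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    """One discrete revision imported from a DB version row."""
    page: str
    version: int
    author: str
    timestamp: str
    body: str  # Markdown body, taken from the DB column as-is


# Hypothetical schema: real table/column names differ and may drift per release.
SCHEMA = """
CREATE TABLE page (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE content (
    page_id INTEGER NOT NULL,
    version INTEGER NOT NULL,
    author TEXT NOT NULL,
    created TEXT NOT NULL,
    body TEXT NOT NULL,
    PRIMARY KEY (page_id, version)
);
"""


def import_history(conn: sqlite3.Connection) -> list[JournalEntry]:
    """Read-only projection: every version row becomes one journal entry."""
    rows = conn.execute(
        "SELECT p.name, c.version, c.author, c.created, c.body "
        "FROM content AS c JOIN page AS p ON p.id = c.page_id "
        "ORDER BY p.name, c.version"
    )
    return [JournalEntry(*row) for row in rows]


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory database so the sketch runs without MojoMojo."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO page VALUES (1, 'home')")
    conn.executemany(
        "INSERT INTO content VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, "alice", "2026-06-01T10:00:00Z", "# Home"),
            (1, 2, "bob", "2026-06-02T09:30:00Z", "# Home\n\nWelcome."),
        ],
    )
    return conn
```

The sketch only issues `SELECT` statements, in line with the read/projection default; nothing here attempts write-through.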
|
||||
|
||||
## 5. UC seed
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-81 | Attach a **DB-backed wiki with no file store / no API** (MojoMojo) by reading its **relational store directly** (page + version tables), mapping schema → page model and **importing DB-resident history** to the journal | **new** |
| — | DB attach vs file attach | enrich **UC-02** / **UC-40** |
| — | DB version-table history import | enrich **UC-36** |
| — | relational page rows / lineage as structure | enrich **UC-34** |
|
||||
|
||||
## 6. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T14 (binding):** add **direct relational read** as a binding (or external-store sub-mode
|
||||
whose medium is a SQL schema) for DB-backed engines lacking a file store or API; HTML
|
||||
scrape is the lossy fallback. Schema is a **versioned coupling** (drift risk, UC-43).
|
||||
- **T13 (history portability):** **DB version rows** = a history source alongside git commits
|
||||
and RCS revisions — import as discrete journal entries (author/timestamp).
|
||||
- **T11 (capability):** "has-file-store" / "has-API" are **absent** here; "has-readable-DB"
|
||||
is the capability — a sparse profile relying on schema knowledge.
|
||||
|
||||
## 7. Open questions
|
||||
|
||||
1. Does shard-wiki sanction **direct third-party DB reads** as a binding, or restrict them
|
||||
(schema coupling/drift) to a documented best-effort mode? How is schema drift across
|
||||
MojoMojo versions handled (UC-43)?
|
||||
2. Is **write-through by direct DB** ever allowed (risking app invariants), or are DB-backed
|
||||
no-API engines read/projection/overlay/backup only?
|
||||
|
||||
## 8. Sources
|
||||
|
||||
- MojoMojo — github.com/mojomojo/mojomojo; metacpan MojoMojo (Catalyst app, DBIx::Class
|
||||
schema: Page / PageVersion / Content / Attachment)
|
||||
- Catalyst + DBIx::Class framework docs (architecture context)
|
||||
- prior: `research/260613-twiki-deep-dive/` (file-store classic contrast, UC-40/41)
|
||||
|
||||
## 9. Traceability
|
||||
|
||||
New UC **UC-81** carries the marker **⊙** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md`. Enriched: UC-02, UC-40, UC-36, UC-34. Architecture cross-refs:
|
||||
SHARD-WP-0002 T14 (direct-DB binding), T13 (DB version-row history), T11.
|
||||
13
research/260614-oddmuse-deep-dive/README.md
Normal file
@@ -0,0 +1,13 @@
|
||||
# 260614 — Oddmuse deep dive
|
||||
|
||||
Deep dive on **Oddmuse**: a deliberately minimal wiki — a **single Perl CGI script** over
|
||||
**plain-text page files** (one file per page, old revisions in a `keep/` dir), no database.
|
||||
The low-complexity **file-store floor** — useful as the **minimal-adapter / graceful-
|
||||
degradation baseline** of the capability profile.
|
||||
|
||||
- `findings.md` — the minimal model, storage layout, capability profile, INTENT mapping, UC
|
||||
seed (UC-82), architecture notes for SHARD-WP-0002, open questions, sources, traceability.
|
||||
|
||||
Catalog yield: UC-82 (attach a **minimal flat-file wiki** — plain-text page files + a simple
|
||||
revision dir — as the **graceful-degradation baseline / minimal capability-profile floor**).
|
||||
Enriched UC-40/01/36/41. Feeds SHARD-WP-0002 T11 (minimal capability profile).
|
||||
124
research/260614-oddmuse-deep-dive/findings.md
Normal file
@@ -0,0 +1,124 @@
|
||||
# Oddmuse — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T7 · **Subject:** Oddmuse, Alex Schroeder's
|
||||
minimal single-script wiki (EmacsWiki, Community Wiki).
|
||||
|
||||
## Why this dive
|
||||
|
||||
After the structure/graph/SaaS far-ends, Oddmuse anchors the **opposite** corner: the
|
||||
*minimal* file-store wiki. It is the reference for shard-wiki's **graceful-degradation**
|
||||
promise — *a limited backend must still be usable* — and defines the **floor** of the
|
||||
capability profile (T11): what the simplest possible real wiki looks like as a shard.
|
||||
|
||||
## 1. The minimal model
|
||||
|
||||
- **One Perl CGI script** (`wiki.pl`) is the whole engine — drop it on any CGI host. No
|
||||
framework, no database, minimal dependencies.
|
||||
- **Plain-text page files**: each page is a text file in a **page directory** (`page/`),
|
||||
with a small header of metadata and the body; **old revisions** are kept in a **`keep/`**
|
||||
directory (recent history retained, older optionally expired).
|
||||
- **Locking** via lock files; edits append a new keep-revision.
|
||||
- **Markup:** simple wiki markup; **CamelCase** and **free links** `[[Like This]]`;
|
||||
InterWiki/near-links; tags and "clusters."
|
||||
- **No DB, no API** (beyond the CGI itself); content *is* the files on disk.
|
||||
|
||||
## 2. The shard view — the file-store floor
|
||||
|
||||
Because pages are **plain-text files on disk**, Oddmuse is **trivially attachable** as a
|
||||
**file-store shard** even though the engine offers nothing fancy:
|
||||
|
||||
- Read the `page/` files → pages (parse the tiny header + body).
|
||||
- Read `keep/` → recent revision history (import to the journal; note it may be **truncated**
|
||||
— older revisions can be expired, so history is *partial*).
|
||||
- Write = write a page file + a keep-revision (respecting the lock) — but the engine's own
|
||||
invariants (indexes) mean **write-through is best done via the engine or carefully**.
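
The read side of the steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the header is taken to be `key: value` lines separated from the body by a blank line, and kept revisions are taken to live under `keep/<page>/`. Both are hypothetical simplifications and may not match Oddmuse's actual on-disk format.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Page:
    name: str
    meta: dict[str, str]
    body: str


@dataclass
class ShardHistory:
    revisions: list[Page]
    complete: bool  # False: older revisions may have been expired


def parse_page(name: str, raw: str) -> Page:
    """Split an assumed 'key: value' header from the body at the first blank line."""
    header, _, body = raw.partition("\n\n")
    meta = {}
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return Page(name=name, meta=meta, body=body)


def read_pages(root: Path) -> list[Page]:
    """Current pages: one file per page under page/."""
    return [parse_page(p.stem, p.read_text(encoding="utf-8"))
            for p in sorted((root / "page").glob("*")) if p.is_file()]


def read_history(root: Path, name: str) -> ShardHistory:
    """Kept revisions for one page, sorted by filename; never claimed complete."""
    keep_dir = root / "keep" / name  # hypothetical layout: keep/<page>/<revision>
    files = sorted(keep_dir.glob("*")) if keep_dir.is_dir() else []
    revs = [parse_page(name, f.read_text(encoding="utf-8")) for f in files]
    return ShardHistory(revisions=revs, complete=False)
```

`read_history` always reports `complete=False`, because the adapter cannot tell from the files alone whether older revisions were expired.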
|
||||
|
||||
This is the **minimal capability profile**: file-store, page-granularity, plain-text,
|
||||
possibly-truncated history, no query, no structured fields, open editing. Everything richer
|
||||
in the synthesis matrix is measured *against this floor*.
|
||||
|
||||
## 3. Capability profile (the floor)
|
||||
|
||||
| Dimension (synthesis spectrum) | Oddmuse |
|--------------------------------|---------|
| Attachment mode | **file-store** (plain-text files); CGI, no API |
| Addressing granularity | page = file |
| Content identity | page name = filename |
| Identity vs placement | name-bound |
| Structure | none beyond tags/clusters; flat page space |
| History | **`keep/` revisions — recent, possibly truncated** |
| Merge model | lock-file; last-writer |
| Native query | none |
| Translation | simple wiki markup (→ Markdown translation needed) |
| Write granularity | page (file) |
| Operational envelope | a CGI script; tiny |
| Access grant | open by default (optional password) |
| Content opacity | transparent text |
| Provenance | minimal (timestamp, optional username) |
|
||||
|
||||
## 4. INTENT mapping
|
||||
|
||||
### Reinforcements
|
||||
|
||||
- **Graceful degradation** (INTENT): Oddmuse is the *definition* of the limited-backend case
|
||||
— still a perfectly good read/projection/overlay/backup shard via its files.
|
||||
- **Union without erasure**: even a minimal shard contributes pages with provenance; its
|
||||
**truncated history** must be surfaced honestly (don't imply full history when `keep/` is
|
||||
partial).
|
||||
- **Open wiki** (UC-01): Oddmuse is open-editing by default — the c2-era ethos.
|
||||
- **Markdown-first but backend-neutral**: its wiki markup needs translation to the
|
||||
Markdown-first page model (UC-42-style), a small lossy step.
|
||||
|
||||
### Divergences (boundaries / notes)
|
||||
|
||||
- **Partial history**: `keep/` may expire old revisions — the journal import must record that
|
||||
history is **truncated/partial**, not complete (a freshness/provenance honesty point).
|
||||
- **Minimal profile** means many capabilities are simply **absent** — the adapter advertises
|
||||
a sparse profile; the orchestrator must not assume query/structure/locking semantics it
|
||||
doesn't have (T11 capability-awareness in its purest form).
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **Minimal flat-file wiki as the graceful-degradation baseline** (UC-82): plain-text files
|
||||
+ simple revision dir = the floor every richer profile extends.
|
||||
2. **Honest partial-history reporting** when a shard's revision store is truncated.
|
||||
3. **Sparse capability profile** handling — absence of a capability is first-class (T11).
|
||||
|
||||
## 5. UC seed
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-82 | Attach a **minimal flat-file wiki** (plain-text page files + a simple revision dir, Oddmuse) as the **graceful-degradation baseline / minimal capability-profile floor**, surfacing **partial history** honestly | **new** |
| — | plain-text file-store at the simple end | enrich **UC-40** |
| — | open-editing wiki | enrich **UC-01** |
| — | `keep/` plain-text revision history (possibly truncated) | enrich **UC-36** / **UC-41** |
|
||||
|
||||
## 6. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T11 (capability model):** Oddmuse defines the **minimal/floor profile** — file-store,
|
||||
page granularity, plain-text, **partial** history, no query/structure. Validate that the
|
||||
capability vocabulary can express **absence** cleanly and that the orchestrator degrades
|
||||
to read/projection/overlay/backup against it.
|
||||
- **History portability (T13):** `keep/` revisions import as journal entries but may be
|
||||
**truncated** — record completeness as metadata (full vs partial history).
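
One way to make absence first-class is a profile whose fields default to "not available", with the orchestrator deriving permitted operations from it. The following Python sketch is illustrative only: the field names, the `History` levels, and the operation labels are assumptions, not a settled capability vocabulary.

```python
from dataclasses import dataclass
from enum import Enum


class History(Enum):
    NONE = "none"
    PARTIAL = "partial"   # e.g. a keep/ directory whose old revisions expire
    FULL = "full"


@dataclass(frozen=True)
class CapabilityProfile:
    attach_mode: str
    history: History = History.NONE
    native_query: bool = False       # absence is the default, stated explicitly
    structured_fields: bool = False
    write_through: bool = False


def allowed_operations(profile: CapabilityProfile) -> set[str]:
    """Operations the orchestrator may plan against a shard with this profile."""
    ops = {"read", "projection", "overlay", "backup"}  # the floor
    if profile.write_through:
        ops.add("write-through")
    if profile.native_query:
        ops.add("delegated-query")
    if profile.history is not History.NONE:
        ops.add("history-import")
    return ops


# A floor profile in the spirit of Oddmuse: file-store with partial history.
FLOOR = CapabilityProfile(attach_mode="file-store", history=History.PARTIAL)
```

Because every richer capability is opt-in, a shard that declares nothing still gets the read/projection/overlay/backup floor and nothing more.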
|
||||
|
||||
## 7. Open questions
|
||||
|
||||
1. How does shard-wiki represent a shard with **partial/truncated history** in the journal
|
||||
and provenance UI (UC-24) — explicit "history begins at" marker?
|
||||
2. Is write-through to a minimal CGI wiki (write page + keep-revision under its lock) ever
|
||||
sanctioned, or read/projection/overlay/backup only by default?
|
||||
|
||||
## 8. Sources
|
||||
|
||||
- oddmuse.org — Oddmuse wiki (single-script install, `page/` + `keep/` storage, markup,
|
||||
CamelCase/free links, clusters/tags)
|
||||
- EmacsWiki / Community Wiki (Oddmuse in production)
|
||||
- prior: `research/260608-c2-wiki-origins/` (open-wiki ethos); `research/260613-twiki-deep-dive/`
|
||||
(file-store + RCS contrast)
|
||||
|
||||
## 9. Traceability
|
||||
|
||||
New UC **UC-82** carries the marker **⊚** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md`. Enriched: UC-40, UC-01, UC-36, UC-41. Architecture cross-refs:
|
||||
SHARD-WP-0002 T11 (minimal/floor profile), T13 (partial-history import).
|
||||
30
research/260614-processing-deep-dive/README.md
Normal file
@@ -0,0 +1,30 @@
|
||||
# 260614 — Processing & Processing.js deep dive
|
||||
|
||||
Date: 2026-06-14 · Source: **SHARD-WP-0004 T4**
|
||||
|
||||
## What this is
|
||||
|
||||
A deep dive into **Processing** (creative coding) and **Processing.js / p5.js** (browser-run
|
||||
sketches): **the sketch *is* the document** — a program whose presentation is **live visual
|
||||
output rendered at view time** in the browser, with **no cached output** by default.
|
||||
|
||||
## Why it matters
|
||||
|
||||
- The cleanest **program-as-page** case: canonical content = **source text**, presentation =
|
||||
**executable render** (no input/output cells, no prose envelope) — sharpens the page model
|
||||
(T12/T15) and UC-54/55.
|
||||
- Adds a **view-time** variant to derivation-projection (the render runs **in the viewer,
|
||||
continuously**) and a **continuity** facet (one-shot vs continuous/interactive); continuous
|
||||
→ static is a **snapshot frame/recording** on the live↔snapshot axis (T6).
|
||||
- "Execute/render in the viewer" = an explicit **capability + trust/sandbox** surface.
|
||||
|
||||
## Yield
|
||||
|
||||
- **No new UC** (enrichment / design prior art). Enriches **UC-54, UC-55**; links UC-83,
|
||||
UC-84, UC-35.
|
||||
|
||||
## Contents
|
||||
|
||||
| Path | Role |
|------|------|
| `findings.md` | Program-as-page, view-time/live render, no-cached-output, capability+trust, INTENT mapping, UC disposition (enrichment-only), architecture notes |
|
||||
114
research/260614-processing-deep-dive/findings.md
Normal file
@@ -0,0 +1,114 @@
|
||||
# Processing & Processing.js — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0004 T4 · **Subject:** Processing (creative
|
||||
coding) and Processing.js / p5.js — **the sketch *is* the document**, rendered live at view
|
||||
time in the browser.
|
||||
|
||||
## Why this dive
|
||||
|
||||
Low-priority, enrichment-focused. Processing tests a page-model edge the notebooks (T2/T3)
|
||||
didn't: a page that is **wholly a program whose presentation *is* its running output** — no
|
||||
separate input/output cells, no prose envelope. The defining question: what is a "page" when
|
||||
**the rendered form is a live computation evaluated at view time**? It feeds UC-54
|
||||
(computation-defined content) and UC-55 (non-Markdown / executable content) and sharpens the
|
||||
**live↔snapshot** axis named at T6.
|
||||
|
||||
## 1. Program-as-page
|
||||
|
||||
- A **Processing sketch** is a program (`setup()` + `draw()`) whose output is a **canvas of
|
||||
visual/animated/interactive content**. There are no "output cells" — **the program's
|
||||
execution is the content**.
|
||||
- **Processing.js** (Resig, 2008; now largely **p5.js**) runs sketches **client-side in the
|
||||
browser** on `<canvas>`. The page ships the **source**; the **rendering happens at view
|
||||
time** in the reader's browser — no server, no pre-rendered artifact required.
|
||||
- So the durable artifact is **text (the sketch source)**; the *presentation* is a **live,
|
||||
view-time derivation** of that source (a derivation-projection, T1) — with the twist that
|
||||
the derivation runs **in the viewer**, continuously (animation/interaction), not once.
|
||||
|
||||
## 2. The view-time / live-render dimension
|
||||
|
||||
This adds a wrinkle beyond Jupyter's *captured* outputs (UC-84) and Mathematica's CDF:
|
||||
|
||||
- **No captured output at all by default** — unlike a notebook, a sketch typically stores
|
||||
**only source**; there is nothing cached. The output exists only when executed.
|
||||
- **Continuous & interactive** — `draw()` loops; mouse/keyboard drive it. The output is
|
||||
**time-based and interaction-based**, so any static capture is a **single frame** (or a
|
||||
recording) — it cannot represent the artifact faithfully (shared limit with Mathematica
|
||||
`Dynamic` and, more extremely, Strudel T5).
|
||||
- **Client-side execution = a capability + trust surface** — running arbitrary sketch code in
|
||||
the viewer is an execution capability with sandboxing concerns; shard-wiki must treat
|
||||
"render live in the viewer" as an explicit, gated capability, never an automatic behavior.
|
||||
|
||||
## 3. INTENT mapping (enrichment-only — no new UC)
|
||||
|
||||
### Reinforcements / refinements
|
||||
|
||||
- **Executable-as-page (UC-54/55).** Processing is the cleanest **"the whole page is the
|
||||
program"** case: content = source text, presentation = view-time live render. Strengthens
|
||||
the page model's need to represent **executable content whose rendered form is derived at
|
||||
view time**, distinct from notebooks (which interleave cells + captured outputs).
|
||||
- **Derivation-projection, view-time variant (T1/UC-83).** The render is a derivation-
|
||||
projection that runs **in the viewer, continuously** — extends the projection model: a
|
||||
derivation may be *materialized ahead* (CDF/nbconvert) or *run at view time* (sketch).
|
||||
- **Live↔snapshot axis (T6).** With **no cached output** and **continuous/interactive**
|
||||
rendering, the only static form is a **snapshot frame or recording** — a concrete point on
|
||||
the live↔snapshot axis; honest treatment offers the snapshot and **marks it as a frame of a
|
||||
live artifact**, never as the artifact.
|
||||
- **Capability + trust gating.** "Execute in the viewer" is an explicit capability with a
|
||||
sandbox/trust boundary — mechanism-over-policy: whether to run, sandbox, or only snapshot
|
||||
stays configurable.
|
||||
|
||||
### Boundaries
|
||||
|
||||
- shard-wiki is **not a sketch runtime**; default is **attach the source + offer a captured
|
||||
snapshot/recording**; live in-viewer rendering is a gated capability. Source is canonical;
|
||||
the render is a view-time projection that may degrade to a snapshot.
|
||||
|
||||
## 4. UC disposition (enrichment-only)
|
||||
|
||||
| Mechanism (findings §) | Catalog UC / thread |
|------------------------|---------------------|
| Sketch = program-as-page; presentation = view-time live render (§1) | UC-54 (enriched: executable/view-time content), UC-55 (enriched: non-Markdown executable) |
| Render = derivation-projection run **in the viewer, continuously** (§1, §2) | links UC-83 (view-time variant) |
| No cached output; continuous/interactive → static = snapshot frame only (§2) | links UC-84, live↔snapshot axis (T6) |
| Client-side execution = capability + trust/sandbox surface (§2) | links UC-35 (capability-awareness) |
|
||||
|
||||
**No new UC** — Processing is design prior art reinforcing executable-as-page; its
|
||||
contribution is the **view-time, no-cached-output, continuous-render** point on the
|
||||
projection/liveness model.
|
||||
|
||||
## 5. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T12/T15 (page model):** represent **program-as-page** — a page whose canonical content is
|
||||
**source text** and whose presentation is an **executable render**; no inherent cached
|
||||
output (contrast notebooks). Non-Markdown executable content type.
|
||||
- **T16 (projection):** add a **materialization timing** facet to derivation-projection:
|
||||
**ahead-of-time** (CDF/nbconvert/static HTML) vs **at view time** (sketch render); and a
|
||||
**continuity** facet: one-shot vs continuous/interactive. Continuous → static is a snapshot
|
||||
frame/recording on the live↔snapshot axis (T6).
|
||||
- **T11 (capabilities):** "execute/render in the viewer" is a capability with a
**trust/sandbox** sub-concern; default off → snapshot.
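
The two facets and the default-off gate can be combined in a small decision function. This Python sketch is an assumption-laden illustration: the enum names, the `viewer_exec_allowed` flag, and the returned labels are hypothetical, chosen only to show how timing, continuity, and the execute capability interact.

```python
from dataclasses import dataclass
from enum import Enum


class Timing(Enum):
    AHEAD_OF_TIME = "ahead-of-time"
    VIEW_TIME = "view-time"


class Continuity(Enum):
    ONE_SHOT = "one-shot"
    CONTINUOUS = "continuous"


@dataclass(frozen=True)
class Projection:
    timing: Timing
    continuity: Continuity


def presentation(proj: Projection, viewer_exec_allowed: bool = False) -> str:
    """Pick how to present a program-as-page; live execution is opt-in."""
    if proj.timing is Timing.AHEAD_OF_TIME:
        return "materialized-output"
    if viewer_exec_allowed:
        return "live-render"
    # Default off: degrade to a static capture and label it honestly.
    if proj.continuity is Continuity.CONTINUOUS:
        return "snapshot-frame-of-live-artifact"
    return "snapshot"
```

A Processing sketch would be `Projection(Timing.VIEW_TIME, Continuity.CONTINUOUS)`, which falls back to a labeled snapshot frame unless the execute capability is granted.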
|
||||
|
||||
## 6. Open questions
|
||||
|
||||
1. Is **view-time execution** ever offered (sandboxed in-viewer render), or does shard-wiki
|
||||
always degrade program-as-page to a captured snapshot/recording? (Capability/trust policy.)
|
||||
2. Should **materialization timing** (ahead-of-time vs view-time) and **continuity**
|
||||
(one-shot vs continuous) be explicit projection metadata, alongside the live↔snapshot
|
||||
axis? (Recurs at T5.)
|
||||
|
||||
## 7. Sources
|
||||
|
||||
- `processing.org`; **p5.js** (`p5js.org`); Processing.js (Resig) history; `<canvas>` /
|
||||
client-side rendering model.
|
||||
- prior: `research/260614-jupyter-deep-dive/` (captured vs no-cached output, UC-84);
|
||||
`research/260614-mathematica-deep-dive/` (`Dynamic`/CDF, snapshot-only);
|
||||
`research/260614-squeak-pharo-deep-dive/` (live↔snapshot axis).
|
||||
|
||||
## 8. Traceability
|
||||
|
||||
**No new UC** (enrichment / design prior art). Enriched: UC-54, UC-55; links UC-83 (view-time
|
||||
derivation-projection), UC-84 (no-cached-output contrast), UC-35 (execute capability +
|
||||
trust). Architecture cross-refs: SHARD-WP-0002 T12/T15 (program-as-page, source-canonical/
|
||||
render-derived), T16 (materialization-timing + continuity facets on the live↔snapshot axis),
|
||||
T11 (view-time-execute capability + sandbox).
|
||||
15
research/260614-quip-deep-dive/README.md
Normal file
@@ -0,0 +1,15 @@
|
||||
# 260614 — Salesforce Quip deep dive
|
||||
|
||||
Deep dive on **Salesforce Quip**: a closed SaaS of **live collaborative documents** with
|
||||
**embedded spreadsheets and live apps**, a REST API (HTML import/export), and
|
||||
**Salesforce-tied identity/permissions**. A hosted-collab contrast to Notion: the
|
||||
document+spreadsheet hybrid under enterprise auth.
|
||||
|
||||
- `findings.md` — the doc/spreadsheet model, REST API, enterprise auth, capability profile,
|
||||
INTENT mapping, UC seed (UC-80), architecture notes for SHARD-WP-0002, open questions,
|
||||
sources, traceability.
|
||||
|
||||
Catalog yield: UC-80 (attach a SaaS live-doc shard whose pages **mix prose + embedded live
|
||||
structured objects** — spreadsheets/apps — via REST with lossy HTML import/export, under
|
||||
enterprise identity). Enriched UC-57/55/58/36/06. Feeds SHARD-WP-0002 T11 (capability /
|
||||
content opacity), T14 (external-API binding).
|
||||
148
research/260614-quip-deep-dive/findings.md
Normal file
@@ -0,0 +1,148 @@
|
||||
# Salesforce Quip — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T6 · **Subject:** Quip (Salesforce), a
|
||||
collaborative-document SaaS.
|
||||
|
||||
## Why this dive
|
||||
|
||||
Notion (UC-57/58/59) gave us the closed-SaaS, external-REST-only, database-as-pages model.
|
||||
Quip is the **enterprise document-collab** contrast: a **document+spreadsheet hybrid** where
|
||||
a page is **prose interleaved with embedded live structured objects** (spreadsheets,
|
||||
calendars, kanban "live apps"), reachable only through a REST API, gated by **Salesforce
|
||||
identity**. The question: how does shard-wiki attach a shard whose "page" is a *mixed
|
||||
prose+live-object document* behind enterprise auth?
|
||||
|
||||
## 1. The document + live-object model
|
||||
|
||||
- A Quip **document** is a real-time collaborative doc (concurrent editing). Its body is
|
||||
rich content — headings, lists, prose — **interleaved with embedded objects**:
|
||||
- **Spreadsheets** are *first-class, inline* — a doc can contain live spreadsheet sections
|
||||
with formulas, not just static tables.
|
||||
- **Live apps** (calendars, kanban boards, project trackers, polls) embed interactive
|
||||
structured widgets inside the document.
|
||||
- **Folders** organize documents; **threads** (docs and chat are unified — every doc is also
|
||||
a conversation thread) carry **messages/comments** inline.
|
||||
- So a Quip "page" is a **hybrid**: prose + embedded structured/live content in one
|
||||
document, with conversation attached. Not Markdown; a proprietary rich model.
|
||||
|
||||
## 2. The REST API (the only door)
|
||||
|
||||
Quip exposes a **REST API**: threads/documents (get, create, **edit-document** with an HTML
|
||||
fragment + a location/section anchor), folders, messages, users, and spreadsheet cell
|
||||
access. Content crosses the API as **HTML** (import and export) — there is **no native
|
||||
Markdown** and no file/git access. Implications:
|
||||
|
||||
- **Import/export is HTML → lossy** to Markdown (like Notion's export, UC-59): embedded
|
||||
spreadsheets and live apps **do not round-trip** to Markdown cleanly — they degrade to
|
||||
tables or links.
|
||||
- Editing is **section/anchor-scoped HTML splice** (`edit-document` targets a location) — a
|
||||
mid-granularity write (a section, not the whole doc, not a character).
|
||||
- **Rate-limited** like any SaaS API; history is internal (revisions exist in-product, with
|
||||
limited API exposure).
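
The lossy HTML → Markdown step can be made explicit by having the projection report what it dropped. The Python sketch below uses only the standard-library `html.parser`; it does not call the Quip API, and the HTML it handles (headings, paragraphs, tables) is an assumed simplification of whatever an export actually contains. The placeholder text and the `losses` list are hypothetical conventions.

```python
from html.parser import HTMLParser


class LossyMarkdown(HTMLParser):
    """Tiny HTML -> Markdown projection that records what it could not keep."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self) -> None:
        super().__init__()
        self.out: list[str] = []
        self.losses: list[str] = []
        self._skip_depth = 0  # > 0 while inside an element we cannot represent

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self._skip_depth == 0:
                # Record the loss once per outermost table instead of hiding it.
                self.losses.append("table")
                self.out.append("[embedded table omitted]\n\n")
            self._skip_depth += 1
        elif self._skip_depth == 0 and tag in self.HEADINGS:
            self.out.append(self.HEADINGS[tag])

    def handle_endtag(self, tag):
        if tag == "table":
            self._skip_depth = max(0, self._skip_depth - 1)
        elif self._skip_depth == 0 and (tag in self.HEADINGS or tag == "p"):
            self.out.append("\n\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.out.append(data)


def project(html: str) -> tuple[str, list[str]]:
    """Return (markdown, losses); a non-empty losses list means a lossy projection."""
    parser = LossyMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip(), parser.losses
```

Returning the losses alongside the Markdown lets a caller surface the degradation to the reader rather than silently flattening a live spreadsheet.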
|
||||
|
||||
## 3. Enterprise identity / permissions
|
||||
|
||||
Quip is tied to **Salesforce** (acquired 2016): authentication and access run through the
|
||||
Salesforce org / Quip site; **folder and document sharing ACLs** govern visibility, with
|
||||
enterprise SSO. So this is an **enterprise-ACL, authn-delegated** shard (UC-06) — shard-wiki
|
||||
honors Salesforce-side permissions and never bypasses them.
|
||||
|
||||
## 4. Capability profile
|
||||
|
||||
| Dimension (synthesis spectrum) | Quip |
|--------------------------------|------|
| Attachment mode | **external-API** (REST; HTML import/export) — closed SaaS |
| Addressing granularity | document; **section/anchor** for edits; spreadsheet cell |
| Content identity | Quip thread/document ID (opaque) |
| Identity vs placement | API-id identity; folder placement separate |
| Structure | **hybrid doc**: prose + embedded spreadsheets + live apps + thread |
| History | **internal** revisions (limited API exposure) |
| Merge model | server-side real-time collab (OT-like); no external merge |
| Native query | none exposed (spreadsheet formulas internal) |
| Translation | **HTML** in/out → **lossy** to Markdown (spreadsheets/apps degrade) |
| Write granularity | **section/anchor HTML splice** (mid) |
| Operational envelope | rate-limited SaaS REST |
| Access grant | **Salesforce identity + folder/doc ACL** (enterprise SSO) |
| Content opacity | proprietary rich model; not E2EE but not transparent files |
| Provenance | author/edit metadata in-product; API-limited |
|
||||
|
||||
## 5. INTENT mapping
|
||||
|
||||
### Reinforcements
|
||||
|
||||
- **External-API attachment** (UC-57): Quip is a second concrete instance beside Notion of
|
||||
the closed-SaaS REST-only shard — generalizes the external-API mode (REST + HTML payload,
|
||||
vs Notion's REST + block-JSON, vs Wiki.js GraphQL).
|
||||
- **Union without erasure / no silent flatten**: embedded spreadsheets and live apps must be
|
||||
**surfaced as what they are** with provenance, and the **HTML→Markdown lossiness made
|
||||
explicit** (UC-59) — never silently drop a live spreadsheet to a static table.
|
||||
- **Authz-in-core, authn-delegated** (settled decision): Quip's Salesforce-tied ACL is the
|
||||
enterprise case — honor delegated identity and the shard's ACL (UC-06).
|
||||
- **Graceful degradation**: with only a lossy HTML export, Quip is still usable as a
|
||||
read/projection/overlay-target/backup shard.
|
||||
|
||||
### Divergences (boundaries / notes)
|
||||
|
||||
- **Mixed prose+live-object page** is a content shape beyond "Markdown body + frontmatter":
|
||||
the page model must allow **embedded typed/live objects within a prose page** (not just a
|
||||
whole-page-is-a-record like Notion DB, but *inline* structured content) — feeds T12 and the
|
||||
non-Markdown-content question (UC-55).
|
||||
- **HTML is the only interchange** — no Markdown, no files, no git. Content opacity is
|
||||
"proprietary-but-exportable": transparent-ish via lossy HTML, not via files (T11
|
||||
content-opacity tier between transparent-files and E2EE-opaque).
|
||||
- **Write granularity = section-anchor HTML splice** — a mid tier (between whole-file and
|
||||
block/character) realized over an API, distinct from fedwiki's story-item op-log.
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **External-API mode generalized** to carry an **HTML payload** variant (Quip) beside
|
||||
block-JSON (Notion) and GraphQL (Wiki.js) — capability/payload-format is part of the
|
||||
adapter profile (UC-57/UC-80).
|
||||
2. **Inline embedded live/structured objects** as a page-model element (prose + embedded
|
||||
spreadsheet/app), with **explicit lossy projection** to Markdown (UC-55/UC-58/UC-59).
|
||||
3. **Enterprise-ACL + delegated identity** honored, not bypassed (UC-06).
|
||||
|
||||
## 6. UC seed
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-80 | Attach a **SaaS live-doc shard** whose pages **mix prose + embedded live structured objects** (spreadsheets / live apps) via **REST with lossy HTML import/export**, under **enterprise (Salesforce) identity** | **new** |
| — | external-API mode w/ HTML payload variant | enrich **UC-57** |
| — | inline embedded spreadsheet/live-app = non-Markdown content | enrich **UC-55** |
| — | embedded structured objects within a prose page | enrich **UC-58** |
| — | internal (API-limited) revision history | enrich **UC-36** |
| — | Salesforce-tied enterprise ACL + SSO | enrich **UC-06** |
|
||||
|
||||
## 7. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T11 (capability / content opacity):** add a **payload-format** facet to external-API
|
||||
shards (HTML / block-JSON / GraphQL) and a **content-opacity tier** "proprietary but
|
||||
lossy-exportable" between transparent-files and E2EE-opaque. Write granularity =
|
||||
section-anchor splice.
|
||||
- **T12 (page model):** support **inline embedded structured/live objects** within a prose
|
||||
page (not only whole-page-as-record) — with explicit lossy render to Markdown.
|
||||
- **T14 (binding):** external-API binding with **HTML import/export**; honor Salesforce
|
||||
identity/ACL; default to read/projection/overlay given rate limits + lossy export.
|
||||
|
||||
## 8. Open questions
|
||||
|
||||
1. Does shard-wiki represent an **embedded live spreadsheet** as a typed sub-object with
|
||||
provenance (preferred) or flatten it to a static Markdown table (lossy) — and can overlays
|
||||
target a spreadsheet cell via the API, or only the prose?
|
||||
2. Given **HTML-only, lossy** interchange and rate limits, is Quip ever a **write-through**
|
||||
shard, or read/projection/overlay/backup by default? (cf. Notion Q1, UC-57.)
|
||||
3. How is **Salesforce identity** mapped to shard-wiki's delegated-authn model — pass-through
|
||||
token, service account, or per-user OAuth?
|
||||
|
||||
## 9. Sources
|
||||
|
||||
- Quip Automation API (REST) docs — quip.com / Salesforce developer docs (threads/documents,
|
||||
`edit-document`, folders, messages, spreadsheets)
|
||||
- Salesforce Quip product docs (live apps, spreadsheets, sharing/permissions)
|
||||
- prior: `research/260614-notion-deep-dive/` (closed-SaaS external-API contrast, UC-57/59)
|
||||
|
||||
## 10. Traceability
|
||||
|
||||
New UC **UC-80** carries the marker **◧** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md`. Enriched: UC-57, UC-55, UC-58, UC-36, UC-06. Architecture
|
||||
cross-refs: SHARD-WP-0002 T11 (payload-format + content-opacity), T12 (inline objects), T14.
|
||||
@@ -1,55 +1,69 @@
|
||||
# 260614 — Shard spectrum synthesis (one capability model across the full dive set)
|
||||
|
||||
Date: 2026-06-14 · **revised 2026-06-14 (v2)**
|
||||
Date: 2026-06-14 · **revised 2026-06-14 (v3)**
|
||||
|
||||
## What this is
|
||||
|
||||
A **synthesis** (no new external research) that reads every studied system *across* the
|
||||
others and distills them into a single comparative model feeding the **shard adapter
|
||||
contract** (`SHARD-WP-0002`).
|
||||
contract** and the **federation track** (`SHARD-WP-0002`).
|
||||
|
||||
**v2** extends from nine systems to the full set and grows the spectra from eleven to
|
||||
**thirteen**. The systems: two conceptual ancestors (**Xanadu, ZigZag**), four engines
|
||||
(**XWiki, TWiki, Foswiki** + the wiki-engines landscape), and the modern note/PKB tools
|
||||
(**Roam, Obsidian, Notion, Joplin, Logseq, Anytype, AFFiNE, AppFlowy, Trilium**), against
|
||||
the federation/origin research.
|
||||
**v3** extends the model to the **SHARD-WP-0003 engine batch** (Federated Wiki, Wikibase/
|
||||
Wikidata, the git-forge wikis Gitea/GitLab/GitHub, TiddlyWiki, ikiwiki, Salesforce Quip,
|
||||
MojoMojo, Oddmuse, UseModWiki) — ~23 systems in all. Per-shard spectra grow from thirteen to
|
||||
**fourteen** (adds **provenance granularity**), and a **new coordination-layer axis** is
|
||||
introduced: the **federation-model taxonomy**.
|
||||
|
||||
Centerpieces:
|
||||
- **The shard family matrix** — every candidate backend × {substrate, attach mode(s),
|
||||
addressing, structure, history, **merge**, query, →Markdown, **opacity**}, with Xanadu/
|
||||
ZigZag as the ideal anchors.
|
||||
- **The thirteen capability spectra** — the claim that the adapter contract should model
|
||||
*positions on spectra*, each anchored at both ends by a real system, with federation
|
||||
ops degrading by position. v2 adds **content opacity** (12th) and **merge model**
|
||||
(13th), plus emphasis on **identity vs placement**, and expands the attachment-mode
|
||||
spectrum (file-store native/interchange-mirror, in-engine-host, local-REST,
|
||||
external-API, CRDT-replica, P2P/no-central-endpoint).
|
||||
- **UC-44–UC-67 → workplan task mapping** (T11–T16).
|
||||
addressing, structure, history, merge, query, →Markdown, opacity}, with Xanadu/ZigZag as
|
||||
ideal anchors and Federated Wiki as the federation-model anchor. v3 adds the flat-file
|
||||
floor (Oddmuse/UseModWiki), git-IS-store (forge wikis/ikiwiki), TiddlyWiki, MojoMojo
|
||||
(direct-DB), Quip (external-API/HTML), and Wikibase (typed-graph far-end).
|
||||
- **The fourteen capability spectra** — the claim that the adapter contract should model
|
||||
*positions on spectra*, each anchored at both ends by a real system, with federation ops
|
||||
degrading by position. v3 adds **provenance granularity** (14th: per-shard → per-page →
|
||||
per-edit → per-statement/value), and refines merge-model (+fork/journal-replay,
|
||||
+coexist-with-rank), attachment-mode (+git-IS-store, +container, +direct-DB,
|
||||
+REST/file-store-hybrid, +external-API payload-format facet), native-query (+SPARQL/RDF
|
||||
far-end, +filter mid), history (+DB-version-rows, +partial/truncated), structure
|
||||
(+typed-graph, +inline-embedded objects), content-opacity (+proprietary-lossy-exportable),
|
||||
write-granularity (+story-item, +section-anchor).
|
||||
- **The federation-model taxonomy (§2.5, new)** — federation itself is plural: fork+journal
|
||||
(Federated Wiki), VCS-replication+ping (ikiwiki), query-time graph join (Wikibase SPARQL
|
||||
`SERVICE`), feed aggregation, activity streams, engine-mirror (Wiki.js). A
|
||||
selectable/composable coordination-layer axis feeding **T1–T6**.
|
||||
- **UC-44–UC-82 → workplan task mapping** (T1–T6 + T11–T16).
|
||||
|
||||
## Contents
|
||||
|
||||
| Path | Role |
|
||||
|------|------|
|
||||
| `findings.md` | Family matrix, the thirteen spectra, cross-cutting through-lines, UC→task fold-in, recommendations/decisions, escalated open questions |
|
||||
| `findings.md` | Family matrix, the fourteen spectra, the federation-model taxonomy, cross-cutting through-lines, UC→task fold-in, recommendations/decisions, escalated open questions |
|
||||
|
||||
## Status
|
||||
|
||||
Synthesis v2 complete. No new use cases (consolidation only). Feeds `SHARD-WP-0002`: T11
|
||||
reframed around the **thirteen spectra** (incl. content-opacity + merge-model); T12
|
||||
page-model breadth (prose + typed/**computed** records + non-Markdown assets +
|
||||
query-defined + **multi-placement/DAG identity**); T13 history (incl. CRDT-log =
|
||||
supplement); T14 **full attachment-mode taxonomy** (CRDT-replica + P2P + interchange-mirror
|
||||
+ local-REST added); T15 lossy-with-fidelity-report (incl. HTML); T16 (addressing, content
|
||||
identity, **identity≠placement**, derived index, dimensional/query). UC coverage extended
|
||||
UC-34–UC-59 → **UC-34–UC-67**.
|
||||
Synthesis v3 complete. No new use cases (consolidation only). Feeds `SHARD-WP-0002`:
|
||||
**T1–T6** federation-model taxonomy (selectable/composable federation); **T11** reframed
|
||||
around the **fourteen spectra** (incl. provenance granularity + expanded attachment modes +
|
||||
external-API payload-format facet); T12 page-model breadth (prose + typed/computed records +
|
||||
**inline-embedded objects** + **typed-graph statements** + non-Markdown assets +
|
||||
query-defined + multi-placement/DAG identity); T13 history (DB-rows + partial-flat-file =
|
||||
supplement; completeness metadata; **journal-shaped** coordination journal); T14 **full
|
||||
attachment taxonomy** (git-IS-store / container / direct-DB / REST-file-hybrid +
|
||||
source-of-truth per binding); T15 lossy + **not-Markdown** (graph/HTML); T16 (addressing
|
||||
incl. statement GUID + **opaque stable identity**, identity≠placement, derived index,
|
||||
**graph query / federated SERVICE**). UC coverage extended UC-34–UC-67 → **UC-34–UC-82**.
|
||||
|
||||
**New through-lines (v2):** CRDT changes the merge math (native merge — never impose git
|
||||
merge); identity ≠ placement (Trilium note/branch) is the model for multi-location/
|
||||
multi-shard pages; metadata can be computed (inherited/templated), not just stored;
|
||||
content opacity is per-item, not only whole-shard; the attach surface is not always the
|
||||
native store (Joplin sync mirror; Logseq file-vs-DB and its substrate migration); the
|
||||
block-graph-on-files sweet spot (Logseq `id::`) resolves the addressing tension.
|
||||
**New through-lines (v3):** federation is plural (the federation-model taxonomy); provenance
|
||||
has a granularity spectrum (down to per-statement, Wikibase rank/references); git is both the
|
||||
home store and the home journal (forge wikis make git *the* store, resolving the engine-
|
||||
mirror write-race); the flat-file floor (Oddmuse/UseModWiki; UseModWiki was Wikipedia's pre-MediaWiki Phase I engine)
|
||||
is the field's common root and the minimal capability profile; the page model must also carry
|
||||
typed-graph statements and inline-embedded objects.
|
||||
**Carried from v2:** CRDT changes the merge math; identity ≠ placement; metadata can be
|
||||
computed; content opacity is per-item; the attach surface is not always the native store; the
|
||||
block-graph-on-files sweet spot (Logseq `id::`).
|
||||
**Carried from v1:** files-canonical/index-derived; fine-grained addressing is adoptable;
|
||||
transclusion=clone=embed is one primitive; structure/history federate iff in-text; attach
|
||||
mode is per-binding; Notion proves the platform can enforce no-silent-mutation.
|
||||
</content>
|
||||
transclusion=clone=embed=reference is one primitive; structure/history federate iff in-text;
|
||||
attach mode is per-binding; the platform can enforce no-silent-mutation.
|
||||
|
||||
@@ -1,26 +1,31 @@
|
||||
# Synthesis — the shard spectrum: one capability model across the full dive set
|
||||
|
||||
Date: 2026-06-14 · **revised 2026-06-14 (v2)** — extended from nine systems to the full
|
||||
set (added Joplin, Logseq, the CRDT local-first cohort — Anytype/AFFiNE/AppFlowy — and
|
||||
Trilium); spectra grown from eleven to **thirteen**.
|
||||
Date: 2026-06-14 · **revised 2026-06-14 (v3)** — extended from the fourteen-system set to
|
||||
include the **SHARD-WP-0003 engine batch** (Federated Wiki, Wikibase/Wikidata, the git-forge
|
||||
wikis Gitea/GitLab/GitHub, TiddlyWiki, ikiwiki, Salesforce Quip, MojoMojo, Oddmuse,
|
||||
UseModWiki). Per-shard spectra grown from thirteen to **fourteen** (added **provenance
|
||||
granularity**), with refinements to merge-model, attachment-mode, native-query, structure,
|
||||
history, content-opacity and write-granularity; and a **new coordination-layer axis — the
|
||||
federation-model taxonomy (§2.5)** — the headline v3 contribution.
|
||||
Source kind: **synthesis** — consolidates every deep dive into a single comparative model
|
||||
feeding the **shard adapter contract** (`SHARD-WP-0002` T11–T16)
|
||||
feeding the **shard adapter contract** and the **federation track** (`SHARD-WP-0002`).
|
||||
Lens: shard-wiki — what a backend must expose to participate, expressed as *spectra* of
|
||||
capability rather than a yes/no checklist
|
||||
capability rather than a yes/no checklist.
|
||||
|
||||
> Purpose. Fourteen tools/systems have now been studied (plus the wiki-engines
|
||||
> landscape): two conceptual ancestors (**Xanadu, ZigZag**); four engines (**XWiki,
|
||||
> TWiki, Foswiki** + landscape); and the modern note/PKB tools (**Roam, Obsidian,
|
||||
> Notion, Joplin, Logseq, Anytype, AFFiNE, AppFlowy, Trilium**), against the federation
|
||||
> and origin research. This document reads them *across* each other. The payoff is a
|
||||
> small set of **capability spectra** — now **thirteen** — each anchored at both ends by
|
||||
> a real system, with federation operations degrading by position. That spectrum *is*
|
||||
> the adapter contract's design surface. v2 folds the post-engine use cases (**UC-44–
|
||||
> UC-67**) into the `SHARD-WP-0002` tasks.
|
||||
> Purpose. ~23 tools/systems have now been studied. Two conceptual ancestors (**Xanadu,
|
||||
> ZigZag**); the file/DB engines (**XWiki, TWiki, Foswiki**, + landscape); the modern
|
||||
> note/PKB tools (**Roam, Obsidian, Notion, Joplin, Logseq, Anytype, AFFiNE, AppFlowy,
|
||||
> Trilium**); and the WP-0003 batch (**Federated Wiki, Wikibase, forge wikis, TiddlyWiki,
|
||||
> ikiwiki, Quip, MojoMojo, Oddmuse, UseModWiki**), against the federation and origin
|
||||
> research. This document reads them *across* each other. The payoff is a small set of
|
||||
> **capability spectra** — now **fourteen** per-shard, plus a **federation-model taxonomy**
|
||||
> for the coordination layer — each anchored at both ends by a real system, with federation
|
||||
> operations degrading by position. That spectrum *is* the adapter contract's design
|
||||
> surface. v3 folds the WP-0003 use cases (**UC-68–UC-82**) into the `SHARD-WP-0002` tasks.
|
||||
|
||||
Inputs: `research/260608-{federation-concepts,wikiengines-overview,c2-wiki-origins,yawex-prior-art}`,
|
||||
`research/260613-{xwiki,twiki,foswiki}-deep-dive`,
|
||||
`research/260614-{xanadu,zigzag,roam,obsidian,notion,joplin,logseq,localfirst-workspaces,trilium}-deep-dive`.
|
||||
`research/260614-{xanadu,zigzag,roam,obsidian,notion,joplin,logseq,localfirst-workspaces,trilium,wikijs,federated-wiki,wikibase,forge-wikis,tiddlywiki,ikiwiki,quip,mojomojo,oddmuse,usemodwiki}-deep-dive`.
|
||||
Output target: `spec/TechnicalSpecificationDocument.md` (adapter contract) via
|
||||
`SHARD-WP-0002`.
|
||||
|
||||
@@ -29,238 +34,373 @@ Output target: `spec/TechnicalSpecificationDocument.md` (adapter contract) via
|
||||
## 1. The shard family matrix
|
||||
|
||||
Candidate-shard backends across the dimensions that matter to the contract. (Xanadu and
|
||||
ZigZag are *not* shards — they are the conceptual ideals each column aspires to; listed
|
||||
last as reference anchors.)
|
||||
ZigZag are *not* shards — they are the conceptual ideals each column aspires to; Federated
|
||||
Wiki is *barely* a shard: it is mostly a **federation model**, see §2.5; all three are listed apart.)
|
||||
|
||||
| Backend | Substrate | Attach mode(s) | Addressing | Structure | History | Merge | Query | →Markdown | Opacity |
|
||||
|---------|-----------|----------------|-----------|-----------|---------|-------|-------|-----------|---------|
|
||||
| **Git folder / repo** | files | file-store | path (+ commit) | flat MD + frontmatter | **git-native** | git/text | no | native | none |
|
||||
| **Git folder / forge wiki** | files (`.wiki.git`) | **file-store (git IS store)** | path (+ commit) | flat MD + frontmatter | **git-native** | git/text | no | native | none |
|
||||
| **ikiwiki** | files (git) → static HTML | file-store (source repo) | path | flat MD + directives | **git-native** | git/text | no | native (src) | none |
|
||||
| **Obsidian** | files | file-store **/** plugin | path + in-file `^id` | frontmatter (in-file) | none (Git plugin) | git/text | plugin (Dataview) | native (OFM) | none |
|
||||
| **Logseq** | files (MD/Org) → SQLite (DB-graph) | file-store **/** plugin | **in-file block `id::`** (sweet spot) | `key::` props (in-file) | none (git) | git/text | **Datalog (derived)** | native-ish | none |
|
||||
| **TWiki / Foswiki** | files + RCS / pluggable | file-store **/** API | path | `%META%` in-file (N records) | **open file (RCS/PlainFile)** | git/text | no | lossless (TML) | none |
|
||||
| **Trilium** | SQLite (one file) | **ETAPI** / scripting / DB | `noteId` + **`branchId`** (id≠place) | labels+relations, **inherited+templated**; **DAG** | internal revisions | conflict-res | attr search | **lossy (HTML)** | **per-note** |
|
||||
| **Joplin** | SQLite-local | **sync-mirror** / local-REST / plugin | page-level `:/id` (store) | notebooks + tags | internal revisions | conflict-notes | search | lossy | E2EE (whole) |
|
||||
| **Logseq** | files (MD/Org) → SQLite | file-store **/** plugin | **in-file block `id::`** (sweet spot) | `key::` props (in-file) | none (git) | git/text | **Datalog (derived)** | native-ish | none |
|
||||
| **TiddlyWiki** | single HTML **/** `.tid` dir | **file-store (container/`.tid`)** | tiddler `title` | typed tiddler **fields** | none / git (.tid) | git/text | **filters** | varies (`type`) | none |
|
||||
| **TWiki / Foswiki** | files + RCS / pluggable | file-store **/** API | path | `%META%` in-file (N records) | **open file (RCS)** | git/text | no | lossless (TML) | none |
|
||||
| **Oddmuse / UseModWiki** | flat files (CGI) | file-store (**minimal floor**) | CamelCase/path | none (flat) | flat-file, **partial** | last-writer | no | lossy | none |
|
||||
| **MojoMojo** | relational DB | **direct-DB-read** | page row / path | relational rows+lineage | **DB version rows** | last-writer | SQL | MD-in-column | none |
|
||||
| **Trilium** | SQLite (one file) | ETAPI / scripting / DB | `noteId` + **`branchId`** (id≠place) | labels+relations, **inherited+templated**; **DAG** | internal revisions | conflict-res | attr search | **lossy (HTML)** | **per-note** |
|
||||
| **Joplin** | SQLite-local | **sync-mirror** / local-REST / plugin | page-level `:/id` | notebooks + tags | internal revisions | conflict-notes | search | lossy | E2EE (whole) |
|
||||
| **Wiki.js** | DB ↔ git mirror | file-store(mirror) / **GraphQL** | path | frontmatter (mirror) | git (via mirror) | git/text | GraphQL | native (mirror) | none |
|
||||
| **XWiki** | DB (Hibernate) | in-engine host / REST | path | **XObjects/XClass** | internal (`xwikircs`) | git/text | yes (XWQL) | engine syntax | none |
|
||||
| **Quip** | hosted (SaaS) | **external-API (HTML)** | doc id + section anchor | **prose + inline spreadsheets/live-apps** | internal | server (OT) | no | **lossy (HTML)** | proprietary |
|
||||
| **Roam** | client DataScript | **in-app host only** | **store UUID** | `key::` attrs | internal (txn log) | (in-app) | yes (Datalog) | (Roam MD) | none |
|
||||
| **Notion** | hosted Postgres | **external API only** | store UUID | **DB schema + relations + rollups** | internal, not portable | internal | yes (DB query) | **lossy** | none |
|
||||
| **Notion** | hosted Postgres | **external-API (block-JSON)** | store UUID | **DB schema + relations + rollups** | internal, not portable | internal | yes (DB query) | **lossy** | none |
|
||||
| **Wikibase / Wikidata** | MediaWiki + RDF index | external-API + **SPARQL** | **opaque Q/P id + stmt GUID** | **typed entity-statement graph** | page revisions (JSON) | last-writer + **rank** | **SPARQL (+SERVICE)** | **lossy (not MD)** | none |
|
||||
| **Anytype** | **CRDT (any-sync)** | **replica / P2P node** | object id (store) | **typed object graph (ontology)** | CRDT log | **native-CRDT** | graph | lossy | **E2EE (whole)** |
|
||||
| **AFFiNE** | **CRDT (Yjs)** | replica / self-host | block id (store) | blocks; **one-data-many-views** | CRDT log | **native-CRDT** | DB filters | lossy | optional |
|
||||
| **AppFlowy** | **CRDT (Yrs)** | replica / self-host | block id (store) | **Notion-style DB + views** | CRDT log | **native-CRDT** | DB query | lossy | self-host |
|
||||
| **TiddlyWiki** | single file | file-store | path | typed tiddlers | none | git/text | no | varies | none |
|
||||
| **MediaWiki / Confluence** | DB | API | path | wikitext / macros | internal-only | internal | limited | lossy | none |
|
||||
| — *Federated Wiki (mostly a federation model, §2.5)* | per-page JSON | **REST/file-store hybrid** | story-item `id` | typed story items | **append-only journal** | **fork + journal-replay** | neighborhood search | wiki-ish | none |
|
||||
| — *Xanadu (ideal)* | permascroll | — | **tumbler (span)** | spans + links | permanent | content-merge | — | — | — |
|
||||
| — *ZigZag (ideal)* | cells/dims | — | cell id | **N dimensions** | — | — | dimension walk | — | — |
|
||||
|
||||
Reading top to bottom is roughly shard-wiki's difficulty gradient: **Git/Obsidian/
|
||||
Logseq** are friction-free file-store cases (Logseq adding block addressing on files);
|
||||
**TWiki/Foswiki/Trilium** add translation and (Trilium) a DAG + computed metadata;
|
||||
**XWiki/Roam/Notion** add DB structure and store addressing; the **CRDT cohort
|
||||
(Anytype/AFFiNE/AppFlowy)** adds a new merge model and (Anytype) P2P+E2EE; **Notion** is
|
||||
the hardest hosted case. Xanadu/ZigZag mark the ideals.
|
||||
Reading top to bottom is roughly shard-wiki's difficulty gradient, now with both ends
|
||||
extended: the **flat-file floor (Oddmuse/UseModWiki)** anchors the bottom — minimal, partial
|
||||
history, the field's common ancestor; **git/forge wikis, ikiwiki, Obsidian, Logseq,
|
||||
TiddlyWiki** are friction-free file-store cases (forge wikis make *git the store itself*);
|
||||
**TWiki/Foswiki/Trilium/Wiki.js** add translation, DAG/computed metadata, or a git mirror;
|
||||
**MojoMojo** needs direct-DB reads; **XWiki/Roam/Notion/Quip** add DB/SaaS structure and
|
||||
store/API addressing; **Wikibase** is the **typed-knowledge-graph far-end** (SPARQL,
|
||||
statement-level provenance); the **CRDT cohort** adds native merge + (Anytype) P2P/E2EE.
|
||||
Xanadu/ZigZag mark the ideals; Federated Wiki marks the federation-model ideal (§2.5).
|
||||
|
||||
---
|
||||
|
||||
## 2. The capability spectra (the contract's real shape) — thirteen
|
||||
## 2. The capability spectra (the contract's real shape) — fourteen
|
||||
|
||||
Each capability is **not boolean** — it is a position on a spectrum anchored at each end
|
||||
by a real system. The contract should model *positions*; federation ops degrade by
|
||||
position.
|
||||
Each capability is **not boolean** — it is a position on a spectrum anchored at each end by
|
||||
a real system. The contract models *positions*; federation ops degrade by position.
|
||||
|
||||
1. **Addressing granularity** — `none → whole-page(path) → page-level store id(Joplin
|
||||
:/id, Trilium noteId) → in-file span(Obsidian ^id) → in-file block(Logseq id::, the
|
||||
sweet spot: block-level AND git-diffable) → store-minted span(Roam/Notion/CRDT UUID) →
|
||||
portable tumbler(Xanadu ideal)`. (UC-51, UC-44/45.)
|
||||
2. **Content identity** — `none → path/title → fingerprint(hash) → span-set/equivalence
|
||||
(Xanadu)`. Drives equivalence + reverse-transclusion. (UC-46, UC-27.)
|
||||
3. **Identity vs placement** *(new emphasis)* — `path = identity(most) → identity
|
||||
separated from placement(Trilium note/branch; a page in many locations = a DAG)`. The
|
||||
clean model for a page under multiple paths/shards. (UC-66, UC-22.)
|
||||
4. **Structure** — `flat Markdown → in-file frontmatter/key::(Obsidian/Logseq) → in-file
|
||||
%META%(TWiki) → typed objects(XWiki) → DB schema+relations+rollups(Notion/AppFlowy) →
|
||||
typed object graph/ontology(Anytype) → computed: inherited+templated(Trilium)`.
|
||||
In-text federates; DB-locked needs sidecar+fidelity; **computed** needs effective-vs-
|
||||
own provenance. (UC-34, UC-39, UC-58, UC-67.)
|
||||
5. **History** — `none → internal-only/not-portable(Notion/Joplin/Trilium) → CRDT update
|
||||
log(Anytype/AFFiNE/AppFlowy) → open file format(TWiki RCS) → git-native(Git/Obsidian/
|
||||
Logseq)`. Internal/CRDT-log ⇒ *supplement*; open-file ⇒ *import*; git ⇒ *adopt*.
|
||||
(UC-36, UC-41.)
|
||||
6. **Merge model** *(new, 13th)* — `none → git/text merge → conflict-notes/keep-both
|
||||
(Joplin) → native-CRDT conflict-free(Anytype/AFFiNE/AppFlowy)`. A CRDT shard merges
|
||||
itself — **never impose git/text merge**; speak the CRDT (replica) or stay projection/
|
||||
overlay. (UC-64.)
|
||||
7. **Native query** — `none → text search → build-your-own derived index(Logseq DataScript
|
||||
over files; shard-wiki can do likewise) → datalog/graph(Roam/Anytype) → DB query
|
||||
(Notion/AppFlowy/XWiki)`. Delegate where present; **build an index over the projection**
|
||||
where not. (UC-52, UC-63, UC-05, UC-54.)
|
||||
1. **Addressing granularity** — `none → whole-page(path) → page-level store id(Joplin :/id,
|
||||
Trilium noteId) → in-file span(Obsidian ^id) → in-file block(Logseq id::, the sweet spot:
|
||||
block-level AND git-diffable) → store-minted span(Roam/Notion/CRDT UUID) → statement GUID
|
||||
(Wikibase) → portable tumbler(Xanadu ideal)`. Story-item `id` (Federated Wiki) is a
|
||||
mid-tier within-page handle. (UC-51, UC-44/45, UC-73.)
|
||||
2. **Content identity** — `none → path/title → opaque stable id, labels-as-annotation
|
||||
(Wikibase Q/P) → fingerprint(hash) → span-set/equivalence(Xanadu)`. Wikibase is the
|
||||
cleanest real instance of **stable, language-neutral identity**. (UC-46, UC-27, UC-73.)
|
||||
3. **Identity vs placement** — `path = identity(most) → identity separated from placement
|
||||
(Trilium note/branch; a page in many locations = a DAG) → provenance-edge links across
|
||||
sites (Federated Wiki fork-DAG; forge wiki = path, identity layered above)`. The clean
|
||||
model for a page under multiple paths/shards. (UC-66, UC-22, UC-71.)
|
||||
4. **Structure** — `flat Markdown(Oddmuse) → in-file frontmatter/key::(Obsidian/Logseq) →
|
||||
in-file %META%(TWiki) → tiddler fields(TiddlyWiki) → relational rows(MojoMojo) → typed
|
||||
objects(XWiki) → DB schema+relations+rollups(Notion/AppFlowy) → prose+inline-embedded
|
||||
objects(Quip) → typed object graph/ontology(Anytype) → computed inherited+templated
|
||||
(Trilium) → typed entity-statement knowledge graph(Wikibase)`. In-text federates;
|
||||
DB-locked needs sidecar+fidelity; **graph/computed** needs effective-vs-own +
|
||||
render-without-flatten. (UC-34, UC-39, UC-58, UC-67, UC-73, UC-80.)
|
||||
5. **History** — `none → partial/truncated flat-file(Oddmuse keep/) → internal-only/
|
||||
not-portable(Notion/Joplin/Trilium) → DB version rows(MojoMojo) → CRDT update log
|
||||
(Anytype/AFFiNE/AppFlowy) → append-only semantic journal(Federated Wiki) → open file
|
||||
format(TWiki RCS) → git-native(Git/forge/ikiwiki/Obsidian/Logseq/Wiki.js mirror)`.
|
||||
Internal/CRDT/DB-rows ⇒ *supplement*; open-file/journal ⇒ *import*; git ⇒ *adopt*.
|
||||
**Completeness is metadata** (Oddmuse history is partial — never imply complete). (UC-36,
|
||||
UC-41, UC-71, UC-81, UC-82.)
|
||||
6. **Merge model** — `none → last-writer → git/text 3-way merge → conflict-notes/keep-both
|
||||
(Joplin) → fork + manual journal-replay(Federated Wiki) → coexist-with-rank(Wikibase:
|
||||
contradictory values kept, curated) → native-CRDT conflict-free(Anytype/AFFiNE/AppFlowy)`.
|
||||
Four+ distinct models — never impose git/text merge on a CRDT or a journal shard; speak
|
||||
the shard's model or stay projection/overlay. (UC-64, UC-71, UC-75.)
|
||||
7. **Native query** — `none → text search → filter expressions(TiddlyWiki) → build-your-own
|
||||
derived index(Logseq DataScript over files; shard-wiki can do likewise) → datalog/graph
|
||||
(Roam/Anytype) → DB query(Notion/AppFlowy/XWiki) → SPARQL/RDF + federated SERVICE
|
||||
(Wikibase, the far-end + query-time cross-shard join)`. Delegate where present; **build an
|
||||
index over the projection** where not. (UC-52, UC-63, UC-05, UC-54, UC-74.)
|
||||
8. **Translation to Markdown** — `native → lossless round-trip(Foswiki TML↔HTML) → lossy-
|
||||
with-fidelity-report(HTML/CKEditor Trilium; Notion blocks; CRDT/object models)`.
|
||||
Lossless ⇒ writable; lossy ⇒ read/projection floor + visible fidelity loss. (UC-42,
|
||||
UC-59, UC-03.)
|
||||
9. **Attachment mode** *(expanded)* — a **per-binding, capability-gated** choice; a
|
||||
backend may offer several:
|
||||
- **file-store** — *native on-disk store*(Obsidian/Logseq/TWiki) **or** *interchange/
|
||||
sync mirror*(Joplin items on WebDAV/S3) (UC-40, UC-53, UC-60, UC-62)
|
||||
- **in-engine host** — adapter inside the app via its API (Roam/Obsidian/Logseq/
|
||||
Trilium scripting, XWiki components) (UC-38, UC-50)
|
||||
with-fidelity-report(HTML/CKEditor Trilium; Notion blocks; Quip HTML+embedded objects;
|
||||
CRDT/object models) → not-Markdown-at-all(Wikibase statements → lossy render or keep
|
||||
graph)`. Lossless ⇒ writable; lossy ⇒ read/projection floor + visible fidelity loss;
|
||||
not-MD ⇒ structured payload + optional rendered view. (UC-42, UC-59, UC-03, UC-73, UC-80.)
|
||||
9. **Attachment mode** — a **per-binding, capability-gated** choice; a backend may offer
|
||||
several. The full taxonomy:
|
||||
- **file-store** — *native on-disk store* (Obsidian/Logseq/TWiki), *git IS the store*
|
||||
(forge `.wiki.git`, ikiwiki source), *single-file container* (TiddlyWiki HTML),
|
||||
*flat-file floor* (Oddmuse), **or** *interchange/sync mirror* (Joplin; Wiki.js git
|
||||
mirror) (UC-40, UC-53, UC-60, UC-62, UC-76, UC-78, UC-79, UC-82, UC-68)
|
||||
- **in-engine host** — adapter inside the app via its API (Roam/Obsidian/Logseq/Trilium
|
||||
scripting, XWiki components) (UC-38, UC-50)
|
||||
- **local-REST** — localhost API, app-running (Joplin Data API; Trilium ETAPI) (UC-38)
|
||||
- **external-API** — remote REST from outside (Notion) (UC-57)
|
||||
- **external-API** — remote API from outside, with a **payload-format facet**:
|
||||
*block-JSON* (Notion, UC-57), *GraphQL* (Wiki.js, UC-69), *HTML* (Quip, UC-80),
|
||||
*forge wiki REST* (GitLab/Gitea, UC-77), *MediaWiki/SPARQL* (Wikibase, UC-73/74)
|
||||
- **direct-DB-read** — read the engine's relational store (MojoMojo) when no file/API
|
||||
exists; schema = a versioned coupling (UC-81)
|
||||
- **CRDT replica** — hold a local CRDT replica (Anytype/AFFiNE/AppFlowy) (UC-64)
|
||||
- **P2P / no-central-endpoint** — replica or peer/node, not a URL (Anytype) (UC-65)
|
||||
- **REST/file-store hybrid** — page JSON over HTTP+CORS or static files (Federated Wiki)
|
||||
(UC-70)
|
||||
10. **Operational envelope** — `local/unbounded → realtime CRDT/WebSocket → rate-limited+
|
||||
eventually-consistent+paginated(Notion ~3 rps)`. Sets live vs cache/poll/webhook.
|
||||
(UC-57, UC-31.)
|
||||
11. **Access grant** — `open(L0) → token → OAuth scoped+revocable(Notion) → P2P key/invite
|
||||
(Anytype)`. The backend may *enforce* no-silent-mutation. (UC-57, UC-65,
|
||||
[[shard-wiki-auth-in-core-decision]].)
|
||||
12. **Content opacity** *(new, 12th)* — `plaintext → encrypted-at-rest whole-shard(Joplin/
|
||||
Anytype E2EE) → per-item(Trilium protected notes — encrypted and plaintext coexist)`.
|
||||
Opaque content ⇒ backup/structure-shell only; never present ciphertext as readable.
|
||||
(UC-61.)
|
||||
13. **Write granularity** — `whole-file(TiddlyWiki) → per-page/note(Git/Obsidian/Joplin/
|
||||
Trilium) → per-block(Roam/Notion/Logseq/CRDT)`. Sets overlay/patch/lock/conflict
|
||||
scope. (UC-35.)
|
||||
eventually-consistent+paginated(Notion ~3 rps, Quip, Wikibase public endpoints)`. Sets
|
||||
live vs cache/poll/webhook. (UC-57, UC-31.)
|
||||
11. **Access grant** — `open(L0; Oddmuse) → token → OAuth scoped+revocable(Notion) →
|
||||
enterprise SSO + ACL(Quip/Salesforce, Wiki.js path rules) → P2P key/invite(Anytype) →
|
||||
own-site-only writes(Federated Wiki)`. The backend may *enforce* no-silent-mutation.
|
||||
(UC-57, UC-06, UC-65, [[shard-wiki-auth-in-core-decision]].)
|
||||
12. **Content opacity** — `plaintext(files) → proprietary-but-lossy-exportable(Quip HTML;
|
||||
Notion) → encrypted-at-rest whole-shard(Joplin/Anytype E2EE) → per-item(Trilium
|
||||
protected notes)`. Opaque ⇒ backup/structure-shell; never present ciphertext (or imply
|
||||
a lossy export is faithful) as readable. (UC-61, UC-80.)
|
||||
13. **Write granularity** — `whole-file(TiddlyWiki single-file) → per-page/note(Git/forge/
|
||||
Obsidian/Joplin/Trilium/Oddmuse) → story-item/paragraph(Federated Wiki) → section-anchor
|
||||
splice(Quip) → per-statement(Wikibase API) → per-block(Roam/Notion/Logseq/CRDT)`. Sets
|
||||
overlay/patch/lock/conflict scope; whole-file ⇒ **no per-page atomicity**. (UC-35,
|
||||
UC-78.)
|
||||
14. **Provenance granularity** *(new, 14th)* — `none → per-shard → per-page(most; author/
|
||||
time) → per-commit(git) → per-edit(journal entry, Federated Wiki) → per-statement/value
|
||||
(Wikibase references + rank)`. How finely the union can attribute and source content;
|
||||
Wikibase pushes provenance below the page (sourced, contradictory values coexist with a
|
||||
curation signal). The page model + journal should *allow* sub-page provenance even if MVP
|
||||
records per page. (UC-24, UC-71, UC-75.)
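The history spectrum (item 5) pairs each position with an ingestion strategy: supplement, import, or adopt. A minimal sketch of that pairing follows. The names `HistoryKind`, `HistoryPlan` and `plan_history` are hypothetical and not part of any spec, mapping `NONE` to `"nothing"` is an assumption, and the `complete` flag only encodes the one case the text names (partial flat-file history).

```python
from dataclasses import dataclass
from enum import Enum


class HistoryKind(Enum):
    NONE = "none"
    PARTIAL_FLAT_FILE = "partial-flat-file"  # e.g. Oddmuse keep/
    INTERNAL_ONLY = "internal-only"          # e.g. Notion, Joplin, Trilium
    DB_VERSION_ROWS = "db-version-rows"      # e.g. MojoMojo
    CRDT_LOG = "crdt-log"                    # e.g. Anytype, AFFiNE, AppFlowy
    SEMANTIC_JOURNAL = "semantic-journal"    # e.g. Federated Wiki
    OPEN_FILE = "open-file"                  # e.g. TWiki RCS
    GIT_NATIVE = "git-native"


# Strategy names follow the text: supplement / import / adopt.
STRATEGY = {
    HistoryKind.NONE: "nothing",  # assumption: nothing to ingest
    HistoryKind.PARTIAL_FLAT_FILE: "supplement",
    HistoryKind.INTERNAL_ONLY: "supplement",
    HistoryKind.DB_VERSION_ROWS: "supplement",
    HistoryKind.CRDT_LOG: "supplement",
    HistoryKind.SEMANTIC_JOURNAL: "import",
    HistoryKind.OPEN_FILE: "import",
    HistoryKind.GIT_NATIVE: "adopt",
}


@dataclass(frozen=True)
class HistoryPlan:
    strategy: str
    complete: bool  # completeness is metadata; never imply a partial history is whole


def plan_history(kind: HistoryKind) -> HistoryPlan:
    """Return the ingestion strategy plus a completeness flag for one shard."""
    incomplete = {HistoryKind.NONE, HistoryKind.PARTIAL_FLAT_FILE}
    return HistoryPlan(STRATEGY[kind], kind not in incomplete)
```

A real adapter would likely report completeness per binding rather than derive it from the kind alone; the fixed set above is only illustrative.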
|
||||
|
||||
*(Content types — Markdown-only → typed records → non-Markdown assets (Excalidraw/Canvas/
|
||||
whiteboards/objects) — remains a cross-cutting page-model demand, tracked under structure
|
||||
+ T12 rather than as a standalone capability. UC-55.)*
|
||||
*(Content types — Markdown-only → typed records → inline-embedded objects (Quip
|
||||
spreadsheets) → non-Markdown assets (Excalidraw/Canvas/whiteboards) → typed-graph statements
|
||||
(Wikibase) — remains a cross-cutting page-model demand, tracked under structure + T12 rather
|
||||
than as a standalone capability. UC-55.)*
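The merge-model rule in item 6 (never impose git/text merge on a shard that has its own model; speak the shard's model or stay projection/overlay) can be read as a small dispatch. The sketch below is one possible reading, not the contract. `MergeModel`, `reconcile_plan` and the returned labels are hypothetical names, and treating `NONE` as always projection/overlay is an assumption.

```python
from enum import Enum


class MergeModel(Enum):
    NONE = "none"
    LAST_WRITER = "last-writer"
    GIT_TEXT = "git-text"
    CONFLICT_NOTES = "conflict-notes"    # e.g. Joplin
    FORK_JOURNAL = "fork-journal"        # e.g. Federated Wiki
    COEXIST_RANK = "coexist-with-rank"   # e.g. Wikibase
    NATIVE_CRDT = "native-crdt"          # e.g. Anytype, AFFiNE, AppFlowy


def reconcile_plan(model: MergeModel, adapter_speaks_model: bool) -> str:
    """Pick how the union reconciles writes for one shard."""
    if model is MergeModel.GIT_TEXT:
        # The only case where a text merge is the shard's own model.
        return "text-merge"
    if adapter_speaks_model and model is not MergeModel.NONE:
        # Delegate to the shard's native reconciliation.
        return "native"
    # Otherwise stay read-side: never force a text merge onto the shard.
    return "projection-overlay"
```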
|
||||
|
||||
Design consequence: **T11's capability vocabulary = these thirteen spectra**, not a flat
|
||||
Design consequence: **T11's capability vocabulary = these fourteen spectra**, not a flat
|
||||
`read/write/diff/...` list. The flat verbs remain the *operations*; the spectra are the
|
||||
*profile* saying how well each verb is supported and how it degrades.
|
||||
*profile* saying how well each verb is supported and how it degrades. The **floor**
|
||||
(Oddmuse/UseModWiki) and **far-ends** (Wikibase graph/query/provenance; CRDT merge; Notion
|
||||
hosted) bound every spectrum with a real system.
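To make "positions on spectra" concrete, here is a hedged sketch of a capability profile in which each spectrum is an ordered tuple and an operation degrades by position. Everything named here (`SPECTRA`, `CapabilityProfile`, `link_target`, the short position labels, and the `page#block` link form) is hypothetical. Only two of the fourteen spectra are shown, and treating each spectrum as strictly linear is a simplification.

```python
from dataclasses import dataclass

# Ordered low-to-high, following the arrows in the spectra list above.
SPECTRA = {
    "addressing": (
        "none", "whole-page", "page-store-id", "in-file-span",
        "in-file-block", "store-span", "statement-guid", "portable-tumbler",
    ),
    "write_granularity": (
        "whole-file", "per-page", "story-item", "section-anchor",
        "per-statement", "per-block",
    ),
}


@dataclass(frozen=True)
class CapabilityProfile:
    backend: str
    positions: dict  # spectrum name -> position label

    def reaches(self, spectrum: str, position: str) -> bool:
        """True if this backend sits at or beyond `position` on `spectrum`."""
        order = SPECTRA[spectrum]
        return order.index(self.positions[spectrum]) >= order.index(position)


def link_target(profile: CapabilityProfile, page: str, block_id: str) -> str:
    """Degrade a block link to a page link when addressing is too coarse."""
    if block_id and profile.reaches("addressing", "in-file-span"):
        return f"{page}#{block_id}"
    return page


LOGSEQ = CapabilityProfile(
    "Logseq", {"addressing": "in-file-block", "write_granularity": "per-block"}
)
ODDMUSE = CapabilityProfile(
    "Oddmuse", {"addressing": "whole-page", "write_granularity": "per-page"}
)
```

With these two profiles, `link_target(LOGSEQ, "Notes", "abc")` keeps the block handle, while the same call against `ODDMUSE` falls back to the page. The verb stays the same; only how well it is supported changes.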
|
||||
|
||||
---
|
||||
|
||||
## 2.5. The federation-model taxonomy (the coordination-layer axis) — new in v3
|
||||
|
||||
The fourteen spectra above describe a *single shard's* capabilities. The WP-0003 batch
|
||||
revealed a second, orthogonal axis the v2 synthesis under-modelled: **federation itself is
|
||||
plural.** "Attach many shards and present a union" can be realized by several distinct
|
||||
coordination models, each a real system, each with different reconciliation semantics. This
|
||||
axis lives at shard-wiki's **coordination layer** (`SHARD-WP-0002` T1–T6), not in a single
|
||||
adapter.
|
||||
|
||||
| Federation model | Exemplar | Mechanism | Reconciliation | Discovery |
|
||||
|------------------|----------|-----------|----------------|-----------|
|
||||
| **Fork + journal** | Federated Wiki | copy page to own site; append-only semantic **journal** records `fork`-with-source | **manual**: compare journals, fork the version you prefer; **chorus**, no canonical | link + fork (**neighborhood**) / curated **roster** |
|
||||
| **VCS replication + ping** | ikiwiki | git clone/pull/push between instances; **XML-RPC pinger** notifies peers to pull/rebuild | **git merge** across clones | configured peers + pings |
|
||||
| **Query-time graph join** | Wikibase | **SPARQL `SERVICE`** runs a sub-query on another endpoint and joins | none (read-time join; rank curates conflicts) | endpoint URLs |
|
||||
| **Feed aggregation** | ikiwiki `aggregate`, RSS/Atom | pull remote feeds in as pages | one-way inbound projection | feed URLs |
|
||||
| **Activity streams** | ActivityPub (federation research) | actor/inbox/outbox Create/Update | per-actor; eventual | actor handles |
|
||||
| **Engine-maintained mirror** | Wiki.js git mirror | DB-canonical engine syncs to a git mirror | engine owns DB↔git sync (don't double-sync) | the mirror repo |
|
||||
|
||||
Two cross-cutting lessons:
|
||||
|
||||
- **git-IS-store vs engine-mirror resolves the write-race.** A **forge wiki** (`.wiki.git`)
|
||||
and **ikiwiki source** make *git the canonical store*, so **write-by-commit is safe** — no
|
||||
engine to race. Wiki.js (engine-mirror, DB canonical) is the opposite and needs care. The
|
||||
contract must record *which side is source of truth* per binding. (Resolves UC-68's open
|
||||
race for the git-canonical case; UC-76.)
|
||||
- **shard-wiki's own coordination journal should be journal-shaped.** Federated Wiki proves
|
||||
the **append-only semantic-op log with provenance entries, page-state-as-derived-replay**
|
||||
pattern in production — the concrete shape for INTENT's coordination journal (UC-71), and a
|
||||
superset that can *ingest* git history, CRDT logs, DB version rows, and partial flat-file
|
||||
histories as differently-grained inputs.
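
The journal-shaped pattern in the second lesson can be sketched as follows. This is a minimal illustration, not Federated Wiki's actual schema: the `JournalEntry` fields, the `replay` and `fork_sources` helpers, and the op names handled are assumptions chosen for the example, and `move` is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JournalEntry:
    """One append-only semantic op. Field names are illustrative assumptions."""
    op: str                         # "create" | "add" | "edit" | "remove" | "fork"
    item_id: Optional[str] = None   # item the op targets (unused by create/fork)
    text: Optional[str] = None      # payload for add/edit
    source: Optional[str] = None    # provenance: site the page was forked from


def replay(journal: list[JournalEntry]) -> dict[str, str]:
    """Derive page state (item_id -> text) by replaying the journal in order."""
    items: dict[str, str] = {}
    for entry in journal:
        if entry.op in ("add", "edit"):
            items[entry.item_id] = entry.text
        elif entry.op == "remove":
            items.pop(entry.item_id, None)
        # "create" and "fork" change no item here; they only record provenance.
    return items


def fork_sources(journal: list[JournalEntry]) -> list[Optional[str]]:
    """Provenance edges: every site this page was forked from, in order."""
    return [entry.source for entry in journal if entry.op == "fork"]
```

Because page state is computed from the log and never stored separately, discarding the derived state loses nothing: replaying the same journal yields the same page.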
|
||||
|
||||
Design consequence: **T1–T6 should model federation as a selectable model (or composition of
|
||||
models), not a single hard-coded flow** — mechanism over policy at the coordination layer,
|
||||
mirroring how T11 models per-shard capability as spectra. A given information space may use
|
||||
fork+journal for human-curated shards, VCS-replication for git shards, query-join for graph
|
||||
shards, and feed-aggregation for read-only sources — concurrently.
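
One way to read "selectable model, not a hard-coded flow" is as a per-shard choice from an enumeration. The sketch below is hypothetical: `FederationModel`, `RECONCILIATION`, the shard names in `space`, and `reconciliation_plan` are invented for illustration, and the reconciliation strings only paraphrase the taxonomy table.

```python
from enum import Enum


class FederationModel(Enum):
    FORK_JOURNAL = "fork+journal"
    VCS_REPLICATION = "vcs-replication+ping"
    QUERY_JOIN = "query-time graph join"
    FEED_AGGREGATION = "feed aggregation"
    ACTIVITY_STREAMS = "activity streams"
    ENGINE_MIRROR = "engine-maintained mirror"


# Reconciliation semantics per model, paraphrased from the taxonomy table.
RECONCILIATION = {
    FederationModel.FORK_JOURNAL: "manual (compare journals, fork the preferred version)",
    FederationModel.VCS_REPLICATION: "git merge across clones",
    FederationModel.QUERY_JOIN: "none (read-time join)",
    FederationModel.FEED_AGGREGATION: "one-way inbound projection",
    FederationModel.ACTIVITY_STREAMS: "per-actor, eventual",
    FederationModel.ENGINE_MIRROR: "engine owns DB<->git sync",
}

# Hypothetical information space: shard name -> federation model chosen for it.
space = {
    "team-notes": FederationModel.FORK_JOURNAL,      # human-curated shard
    "docs-repo": FederationModel.VCS_REPLICATION,    # git shard
    "knowledge-graph": FederationModel.QUERY_JOIN,   # graph shard
    "news-feed": FederationModel.FEED_AGGREGATION,   # read-only source
}


def reconciliation_plan(space: dict[str, FederationModel]) -> dict[str, str]:
    """Look up how each shard reconciles, given the model selected for it."""
    return {shard: RECONCILIATION[model] for shard, model in space.items()}
```

Keeping the model as data on the binding, rather than as control flow, is what lets several models run side by side in one information space.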
|
||||
|
||||
---
|
||||
|
||||
## 3. Cross-cutting findings (the through-lines)
|
||||
|
||||
- **Files-canonical, index-derived is the winning architecture.** Obsidian's
|
||||
MetadataCache, **Logseq's DataScript-over-files**, Git's working tree, and shard-wiki's
|
||||
projection model agree: the graph/backlinks/structured index are **derived and
|
||||
rebuildable**, never a second source of truth. Roam/Notion/CRDT invert this (store
|
||||
canonical) and pay in portability. shard-wiki keeps files+journal canonical.
|
||||
- **Fine-grained addressing is real, adoptable, and can be git-diffable.** Roam/Notion/
|
||||
CRDT mint store UUIDs; Obsidian uses in-file `^id`; **Logseq resolves the tension —
|
||||
block-level `id::` that is also git-diffable text**. Adopt native IDs where present,
|
||||
fingerprint where not; portable span address stays the open ideal (Xanadu).
|
||||
- **Identity ≠ placement.** *(new)* Trilium's **note vs branch** (a note cloned into many
|
||||
locations = a DAG) is the clean model for a page under multiple paths or across shards —
|
||||
separate *what a page is* from *where it sits*. The namespace-level form of the clone/
|
||||
reference primitive.
|
||||
- **Transclusion ⇄ clone ⇄ embed ⇄ cloned-note is one primitive.** Xanadu transclusion,
|
||||
ZigZag clone, Roam/Logseq embed, Obsidian `![[ ]]`, Notion synced block, **Trilium note
|
||||
cloning** — one "reference-not-copy" primitive over an addressable union (UC-32/44/45/
|
||||
51/66).
|
||||
- **CRDT changes the merge math.** *(new)* The CRDT cohort merges concurrent edits itself;
|
||||
shard-wiki must **not impose git/text merge** — speak the CRDT (replica) or stay a
|
||||
projection/overlay. A new **merge-model** capability (spectrum 6).
|
||||
- **Structure & history federate iff in text; metadata can be computed.** `%META%`/
|
||||
frontmatter/`key::` diff and travel; XObjects/Notion-DB/CRDT lock in. *(new)* Trilium
|
||||
adds **computed metadata** (inherited+templated) — represent effective-vs-own, don't
|
||||
flatten.
|
||||
- **Content opacity is per-item, not only whole-shard.** *(new)* Joplin/Anytype encrypt
|
||||
whole-shard; **Trilium encrypts per-note** — the opacity capability must be granular.
|
||||
- **The attach surface is not always the native store.** *(new)* Joplin's best surface is
|
||||
its **sync mirror** (not its SQLite); Logseq offers file-graph **or** DB-graph, and is
|
||||
**migrating substrate** (file→SQLite). Bind to capabilities, not to "it's files."
|
||||
- **The page model must stretch many ways at once:** prose Markdown, typed/computed
|
||||
records (N-per-page, relations, inheritance), non-Markdown assets, reference/query-
|
||||
defined pages, and **multi-placement (DAG) identity**. The heaviest demand on T12.
|
||||
- **Files-canonical, index-derived is the winning architecture.** Obsidian's MetadataCache,
|
||||
**Logseq's DataScript-over-files**, Git's working tree, **Wikibase's WDQS (SPARQL index
|
||||
rebuilt from canonical JSON entities)**, and shard-wiki's projection model agree: the
|
||||
graph/backlinks/query index is **derived and rebuildable**, never a second source of truth.
|
||||
Roam/Notion/CRDT invert this (store canonical) and pay in portability. shard-wiki keeps
|
||||
files + journal canonical.
|
||||
- **Git is both the home store and the home journal.** *(sharpened in v3)* Forge wikis make
|
||||
git *the* store; ikiwiki and Wiki.js make git the source or mirror; for all of them the
|
||||
**git log is the coordination journal with zero synthesis**. The git-canonical cases are
|
||||
the friction-free core; everything else is measured as deviation.
|
||||
- **The flat-file floor is the field's common root.** *(new)* Oddmuse and UseModWiki
|
||||
(Wikipedia's Phase I engine, the predecessor of MediaWiki) show the minimal plain-text page+history wiki is the
|
||||
ancestor every richer engine elaborates — so the **minimal/floor capability profile** is
|
||||
the right baseline, and shard-wiki's page model must stay attach-compatible with it
|
||||
(flat files, CamelCase identities, **partial** history). (UC-82.)
|
||||
- **Federation is plural** *(new — §2.5).* Fork+journal, VCS-replication+ping, query-time
|
||||
graph join, feed aggregation, activity streams, engine-mirror — distinct coordination
|
||||
models, selectable and composable, not one flow.
|
||||
- **Provenance has a granularity spectrum** *(new — spectrum 14).* From per-shard down to
|
||||
**per-statement/value** (Wikibase references + rank). Union-without-erasure includes
|
||||
*attribution at the right grain* and letting sourced contradictions coexist with curation.
|
||||
- **Identity ≠ placement.** Trilium's **note vs branch** (a note cloned into many locations =
|
||||
a DAG) and Wikibase's **opaque stable IDs (labels-as-annotation)** are the clean models:
|
||||
separate *what a page is* from *where it sits* and from *what it's called*. Federated
|
||||
Wiki's fork entries are **provenance edges** between same-named pages across sites.
|
||||
- **Transclusion ⇄ clone ⇄ embed ⇄ cloned-note ⇄ reference is one primitive** over an
|
||||
addressable union (Xanadu, ZigZag, Roam/Logseq, Obsidian `![[ ]]`, Notion synced block,
|
||||
Trilium cloning, TiddlyWiki `{{ }}`). (UC-32/44/45/51/66.)
|
||||
- **CRDT changes the merge math; journals and rank add more models.** *(extended)* The merge
|
||||
spectrum now spans last-writer → git 3-way → conflict-notes → **fork+journal-replay** →
|
||||
**coexist-with-rank** → native-CRDT. Never impose git/text merge across that range.
|
||||
- **Structure & history federate iff in text; metadata can be computed; the far-end is a
|
||||
graph.** `%META%`/frontmatter/`key::`/tiddler-fields diff and travel; XObjects/Notion-DB/
|
||||
CRDT/Quip-objects lock in; Trilium computes metadata (inherited+templated); **Wikibase is
|
||||
a full typed knowledge graph** — render to Markdown is lossy, so keep the graph and offer a
|
||||
view, never silent-flatten.
|
||||
- **The attach surface is rarely "just files," and source-of-truth varies.** *(extended)*
|
||||
Joplin's best surface is its sync mirror; Logseq offers file-graph or DB-graph and is
|
||||
migrating substrate; **forge wikis = git-canonical (write freely)** vs **Wiki.js =
|
||||
DB-canonical mirror (write carefully)** vs **MojoMojo = DB-only (direct read)**. Bind to
|
||||
capabilities *and* record the source of truth.
|
||||
- **The page model must stretch many ways at once:** prose Markdown, typed/computed records
|
||||
(N-per-page, relations, inheritance), **inline-embedded objects** (Quip), **typed-graph
|
||||
statements** (Wikibase), non-Markdown assets, reference/query-defined pages, and
|
||||
**multi-placement (DAG) identity**. The heaviest demand on T12.
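
The first through-line above (files canonical, index derived and rebuildable) can be illustrated with a backlink index computed purely from page text. This is a sketch under assumptions: pages are held as a name-to-text mapping, links use `[[Target]]` or `[[Target|label]]` syntax, and `build_backlinks` is a hypothetical helper, not shard-wiki's API.

```python
import re
from collections import defaultdict

# Assumed link syntax: [[Target]], [[Target|label]] or [[Target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def build_backlinks(pages: dict[str, str]) -> dict[str, set[str]]:
    """Derive target -> {pages linking to it} from canonical page text."""
    backlinks: dict[str, set[str]] = defaultdict(set)
    for name, text in pages.items():
        for target in WIKILINK.findall(text):
            backlinks[target.strip()].add(name)
    return dict(backlinks)
```

Since the index is a pure function of the files, it can be deleted and rebuilt at any time without losing information, which is what keeps it from becoming a second source of truth.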
|
||||
|
||||
---
|
||||
|
||||
## 4. How the post-engine use cases fold into the workplan
|
||||
## 4. How the use cases fold into the workplan
|
||||
|
||||
UC-44–UC-67 (from the 260614 dives) map onto the adapter-contract tasks:
|
||||
The 260614 + WP-0003 use cases (UC-44–UC-82) map onto the adapter-contract and federation
|
||||
tasks:
|
||||
|
||||
| UCs | Theme | Lands in |
|
||||
|-----|-------|----------|
|
||||
| UC-35, UC-50, UC-53, UC-57, **UC-60, UC-62, UC-64, UC-65** | attachment modes (file-store native/mirror, in-engine, local-REST, external-API, CRDT-replica, P2P) + operational envelope + merge model | **T11** + **T14** |
|
||||
| UC-34, UC-39, UC-55, UC-58, **UC-67** | structured/typed/**computed** payload, non-MD assets, DB schema+relations | **T12** |
|
||||
| UC-36 | internal/CRDT-log history = supplement | **T13** |
|
||||
| UC-42, UC-59 | translation: lossless vs lossy-with-fidelity-report (HTML/blocks/CRDT) | **T15** |
|
||||
| UC-31 | webhooks / realtime / push-vs-poll | **T6** / **T11** envelope |
|
||||
| UC-57 §6, **UC-61, UC-65** | scoped grant; **content opacity** (whole/per-item); P2P key | **T11** (access-grant + opacity) |
|
||||
| UC-44, UC-45, UC-46, UC-51, **UC-63, UC-66** | span addressing, content identity, **identity≠placement**, transclusion-as-reference, derived index | **T16** (+ T12) |
|
||||
| UC-47, UC-48, UC-52, UC-54 | dimensional navigation, query delegation/build, query-defined pages | **T16** (+ T5/T10) |
|
||||
| UC-35, UC-50, UC-53, UC-57, UC-60, UC-62, UC-64, UC-65, **UC-68, UC-70, UC-76, UC-77, UC-78, UC-79, UC-80, UC-81, UC-82** | attachment modes (file-store native/git-IS-store/container/mirror, in-engine, local-REST, external-API w/ payload-format, direct-DB, CRDT-replica, P2P, REST/file-store-hybrid) + operational envelope | **T11** + **T14** |
|
||||
| UC-34, UC-39, UC-55, UC-58, UC-67, **UC-73, UC-80** | structured/typed/computed/**graph** payload, **inline-embedded objects**, non-MD assets | **T12** |
|
||||
| UC-36, UC-41, **UC-81, UC-82** | history: internal/CRDT-log/**DB-rows**/**partial-flat-file** = supplement; **completeness metadata** | **T13** |
|
||||
| UC-42, UC-59, **UC-73, UC-80** | translation: lossless vs lossy-with-fidelity vs **not-Markdown** (graph/HTML/objects) | **T15** |
|
||||
| UC-31, **UC-79** | webhooks / realtime / push-vs-poll / **VCS ping** | **T6** / **T11** envelope |
|
||||
| UC-57 §6, UC-61, UC-65, **UC-06, UC-80** | scoped grant; **enterprise ACL/SSO**; content opacity (whole/per-item/**lossy-exportable**); P2P key | **T11** (access-grant + opacity) |
|
||||
| UC-44, UC-45, UC-46, UC-51, UC-63, UC-66, **UC-73, UC-74** | span/statement addressing, **opaque stable identity**, identity≠placement, transclusion-as-reference, derived index, **graph query** | **T16** (+ T12) |
|
||||
| UC-47, UC-48, UC-52, UC-54, **UC-74** | dimensional navigation, query delegation/build, query-defined pages, **SPARQL/federated SERVICE** | **T16** (+ T5/T10) |
|
||||
| UC-24, **UC-71, UC-75** | **provenance granularity** (per-edit / per-statement); coordination journal shape | **T13/T16** + journal |
|
||||
| UC-26, UC-27, UC-28, UC-30, UC-05, **UC-71, UC-72, UC-79** | **federation-model taxonomy** (fork+journal, VCS-replication+ping, query-join, chorus/neighborhood/roster) | **T1–T6** |
|
||||
|
||||
No new task needed — **T16** already owns the addressing/identity/navigation thread; v2
|
||||
adds **identity≠placement (UC-66)** and **build-your-own derived index (UC-63)** to it,
|
||||
and **computed metadata (UC-67)** + **non-MD/multi-placement** to **T12**.
|
||||
The one **structural** v3 change: the **federation-model taxonomy (§2.5)** is a new design
|
||||
surface for **T1–T6** — federation becomes a *selectable, composable model* rather than a
|
||||
single flow. Per-shard, **T11** grows to **fourteen spectra** (adds provenance granularity)
|
||||
and **T14** absorbs the expanded attachment-mode taxonomy (git-IS-store, container,
|
||||
direct-DB, REST/file-store-hybrid, external-API payload-format facet).
|
||||
|
||||
---
|
||||
|
||||
## 5. Recommendations (decisions to record under SHARD-WP-0002)
|
||||
|
||||
1. **Model capabilities as the thirteen spectra (§2)**, not flat verbs. (T11.)
|
||||
2. **Two new spectra: content opacity (per-item) and merge model.** A CRDT shard's merge
|
||||
is native — never git-merge it; an encrypted shard degrades to structure-shell. (T11.)
|
||||
3. **Adopt native span IDs (prefer in-file/git-diffable, à la Logseq `id::`); fingerprint
|
||||
otherwise; keep portable-span-address open.** (T16/T11, UC-51/46.)
|
||||
4. **Separate page identity from placement** (Trilium note/branch) in the page/namespace
|
||||
model — a page is one entity with N placements (paths/shards). (T12/T16, UC-66.)
|
||||
5. **One reference-not-copy primitive** for transclusion/clone/embed/cloned-note over the
|
||||
addressable union. (T16, UC-32/44/45/66.)
|
||||
6. **Prefer in-text structure; tolerate DB structure via sidecar+fidelity; represent
|
||||
computed (inherited/templated) metadata as effective-vs-own.** (T12/T15, UC-34/58/67.)
|
||||
7. **Attachment mode is a per-binding, capability-gated choice** across the full set:
|
||||
file-store (native *or* interchange-mirror), in-engine-host, local-REST, external-API,
|
||||
CRDT-replica, P2P/no-central-endpoint. Bind to capabilities — substrate can migrate
|
||||
(Logseq). (T14, UC-40/38/50/57/60/62/64/65 + UC-43.)
|
||||
8. **Query: delegate to a native engine where present; build a derived index over the
|
||||
projection where not** (Logseq pattern). (T16/T5, UC-52/63/54.)
|
||||
9. **Keep files + coordination journal canonical; all indexes/projections derived and
|
||||
rebuildable.** (Architecture invariant, INTENT.)
|
||||
1. **Model per-shard capabilities as the fourteen spectra (§2)**, not flat verbs. (T11.)
|
||||
2. **Model federation as a selectable/composable taxonomy of models (§2.5)** — fork+journal,
|
||||
VCS-replication+ping, query-time graph join, feed aggregation, activity streams,
|
||||
engine-mirror — at the coordination layer. (T1–T6.)
|
||||
3. **Make the coordination journal journal-shaped** (Federated Wiki): append-only semantic
|
||||
ops + provenance entries, page state = derived replay; able to *ingest* git history, CRDT
|
||||
logs, DB version rows, and partial flat-file histories as differently-grained inputs.
|
||||
(T13 + journal, UC-71.)
|
||||
4. **Add provenance granularity as a spectrum** (per-shard → per-page → per-edit →
|
||||
per-statement/value); allow sub-page provenance and **coexist-with-rank** for sourced
|
||||
contradictions. (T11/T13, UC-24/75.)
|
||||
5. **Record source-of-truth per binding** and resolve the write-race accordingly:
|
||||
git-canonical (forge wiki / ikiwiki) ⇒ write-by-commit; engine-mirror (Wiki.js) ⇒ careful;
|
||||
direct-DB (MojoMojo) ⇒ read/projection/overlay default. (T14, UC-68/76/79/81.)
|
||||
6. **Expand attachment mode (§2 #9 / T14):** file-store (native / **git-IS-store** /
|
||||
**container** / mirror / **flat-file floor**), in-engine, local-REST, **external-API with
|
||||
a payload-format facet** (block-JSON/GraphQL/HTML/forge-REST/SPARQL), **direct-DB-read**,
|
||||
CRDT-replica, P2P, **REST/file-store-hybrid**. Bind to capabilities; substrate can
|
||||
migrate. (T14, UC-40/68/70/76/77/78/79/80/81/82 + UC-43.)
|
||||
7. **Page model: support a typed-graph payload and inline-embedded objects, not only
|
||||
typed records** — render to Markdown lossily, keep the graph/objects canonical; preserve
|
||||
computed (inherited/templated) metadata as effective-vs-own. (T12/T15, UC-73/80/67.)
|
||||
8. **Capability profile must express absence and partiality cleanly** (Oddmuse floor):
|
||||
sparse profiles, **partial/truncated history reported honestly**, graceful degradation to
|
||||
read/projection/overlay/backup. (T11/T13, UC-82.)
|
||||
9. **Query: SPARQL/RDF + federated `SERVICE` is the graph far-end** — delegate to native
|
||||
engines (filters/datalog/DB-query/SPARQL) where present; build a derived index over the
|
||||
projection where not. (T16/T5, UC-52/63/74.)
|
||||
10. **Adopt opaque stable identity with labels-as-annotation** (Wikibase) and **separate
|
||||
identity from placement** (Trilium note/branch); fork entries are provenance edges.
|
||||
(T16, UC-73/66/71.)
|
||||
11. **Keep files + coordination journal canonical; all indexes/projections derived and
|
||||
rebuildable.** (Architecture invariant, INTENT.)
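
Recommendation 5 (record source of truth per binding) could take roughly the following shape. All names here (`SourceOfTruth`, `WritePolicy`, `Binding`, `DEFAULT_POLICY`) are hypothetical; the mapping only restates the three cases named in the recommendation, and what "careful" means operationally for an engine mirror is deliberately left unspecified.

```python
from dataclasses import dataclass
from enum import Enum


class SourceOfTruth(Enum):
    GIT_CANONICAL = "git-canonical"   # forge wiki, ikiwiki source
    ENGINE_MIRROR = "engine-mirror"   # Wiki.js: DB canonical, git is a mirror
    DIRECT_DB = "direct-db"           # MojoMojo: DB only


class WritePolicy(Enum):
    WRITE_BY_COMMIT = "write-by-commit"
    WRITE_WITH_CARE = "write-with-care"   # details are an open design question
    READ_PROJECTION_OVERLAY = "read/projection/overlay"


# Default write policy per source of truth, as stated in recommendation 5.
DEFAULT_POLICY = {
    SourceOfTruth.GIT_CANONICAL: WritePolicy.WRITE_BY_COMMIT,
    SourceOfTruth.ENGINE_MIRROR: WritePolicy.WRITE_WITH_CARE,
    SourceOfTruth.DIRECT_DB: WritePolicy.READ_PROJECTION_OVERLAY,
}


@dataclass(frozen=True)
class Binding:
    """A shard attachment that records which side is canonical."""
    shard: str
    source_of_truth: SourceOfTruth

    def default_write_policy(self) -> WritePolicy:
        return DEFAULT_POLICY[self.source_of_truth]
```

Making `source_of_truth` a required field means a binding cannot be created without the question being answered, which is the point of the recommendation.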
|
||||
|
||||
These honor INTENT: mechanism over policy (spectra not hard-coded behaviors), union
|
||||
without erasure (fidelity + effective-vs-own + identity/placement all preserve
|
||||
information), graceful degradation (every capability has a read/projection floor; opaque/
|
||||
CRDT shards degrade to backup/projection), no silent mutation (access-grant + opacity +
|
||||
overlay + respect native merge), shard sovereignty (no backend forced to change
|
||||
substrate; substrate may migrate), Markdown-first/backend-neutral (in-text preferred, DB/
|
||||
CRDT/HTML tolerated).
|
||||
These honor INTENT: mechanism over policy (capability *and now federation* modelled as
|
||||
spectra/taxonomy, not hard-coded behaviors), union without erasure (fidelity +
|
||||
effective-vs-own + identity/placement + **provenance granularity** all preserve
|
||||
information), graceful degradation (every spectrum has a read/projection floor; the
|
||||
Oddmuse floor is explicit; opaque/CRDT/graph shards degrade to backup/projection/lossy-view),
|
||||
no silent mutation (access-grant + opacity + overlay + respect native merge + **source-of-
|
||||
truth per binding**), shard sovereignty (no backend forced to change substrate),
|
||||
Markdown-first/backend-neutral (in-text preferred; DB/CRDT/HTML/**graph** tolerated with a
|
||||
view).
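
Graceful degradation with an explicit floor can be sketched as a sparse capability profile: any spectrum a shard does not declare falls back to the read/projection floor, and history completeness is reported rather than assumed. The class, the spectrum keys, and the level strings below are illustrative assumptions, not the T11 vocabulary itself.

```python
from dataclasses import dataclass, field

FLOOR = "read/projection"  # assumed name for the floor every spectrum shares


@dataclass
class CapabilityProfile:
    """Sparse per-shard profile: only declared spectra are stored."""
    shard: str
    levels: dict[str, str] = field(default_factory=dict)  # spectrum -> level
    history_completeness: str = "unknown"  # "full" | "partial" | "unknown"

    def level(self, spectrum: str) -> str:
        """Declared level, or the floor when the shard says nothing."""
        return self.levels.get(spectrum, FLOOR)


# A floor-style shard declares almost nothing and admits its history is partial.
oddmuse = CapabilityProfile("oddmuse", history_completeness="partial")

# A richer shard declares only the spectra where it rises above the floor.
forge = CapabilityProfile(
    "forge-wiki",
    levels={"merge-model": "git 3-way"},
    history_completeness="full",
)
```

Absence is the default rather than an error, so a minimal shard needs no special casing.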
|
||||
|
||||
---
|
||||
|
||||
## 6. Open questions escalated by the synthesis
|
||||
|
||||
1. **Portable span address** across heterogeneous backends — wrap native IDs (Roam/Notion/
|
||||
CRDT UUID, Logseq `id::`, Trilium noteId) in a shard-scoped address? (T16.)
|
||||
2. **CRDT shards** — embed a CRDT client (Yjs/Yrs) for a live replica, or consume
|
||||
snapshots? Overlays as CRDT ops or out-of-band patches? (T14, UC-64.)
|
||||
3. **Identity vs placement** — does shard-wiki adopt the note/branch split as its own page
|
||||
model, and is a cloned page one page with N placements or a transclusion? (T12/T16,
|
||||
UC-66.)
|
||||
4. **Computed metadata** — materialize effective attributes (snapshot) or compute live
|
||||
from the shard's tree/templates, with per-attribute provenance? (T12, UC-67.)
|
||||
5. **Page model breadth** — can one model carry prose + typed/computed records + non-MD
|
||||
assets + query-defined pages + multi-placement identity coherently? (T12.)
|
||||
6. **External-API write-through at scale** (Notion ~3 rps) vs read/projection default.
|
||||
(T14, UC-57.)
|
||||
7. **Content opacity** — what is visible for an opaque shard/item (IDs/structure vs
|
||||
nothing); does shard-wiki ever hold keys? (T11, UC-61.)
|
||||
1. **Federation composition** — can one information space run several federation models
|
||||
(§2.5) concurrently over different shards, and how does the union reconcile a chorus
|
||||
(fork+journal) with a canonical-asserting shard (Notion / upstream main)? (T1–T6,
|
||||
UC-72.)
|
||||
2. **Coordination-journal op vocabulary** — adopt Federated Wiki's exact ops
|
||||
(create/add/edit/move/remove/fork) at item grain, or an abstract op set other shards can
|
||||
emit, with git/CRDT/DB-row/flat-file histories ingested as inputs? (T13, UC-71.)
|
||||
3. **Provenance granularity in the model** — does the journal/page model carry per-statement
|
||||
provenance + rank (Wikibase), per-edit (journal), or per-page (MVP), configurable? (T13,
|
||||
UC-75.)
|
||||
4. **Typed-graph page** — model a Wikibase entity natively (statements) or always project to
|
||||
a lossy Markdown/table view, or both (canonical graph + view)? Is SPARQL a union-level
|
||||
capability or pass-through? (T12/T16, UC-73/74.)
|
||||
5. **Source-of-truth + write-race** — formalize git-canonical vs engine-mirror vs direct-DB
|
||||
per binding; sanction direct third-party DB reads (schema drift) and write-by-commit
|
||||
timing. (T14, UC-68/76/79/81.)
|
||||
6. **Portable span address** across heterogeneous backends — wrap native IDs (Roam/Notion/
|
||||
CRDT UUID, Logseq `id::`, Trilium noteId, **Wikibase Q/P + stmt GUID**, fedwiki story-item
|
||||
id) in a shard-scoped address? (T16.)
|
||||
7. **CRDT shards** — embed a CRDT client (Yjs/Yrs) for a live replica, or consume snapshots?
|
||||
Overlays as CRDT ops or out-of-band patches? (T14, UC-64.)
|
||||
8. **Page model breadth** — can one model carry prose + typed/computed records +
|
||||
inline-embedded objects + typed-graph statements + non-MD assets + query-defined pages +
|
||||
multi-placement identity coherently? (T12.)
|
||||
9. **Whole-file & partial-history shards** — overlays against a whole-file shard
|
||||
(TiddlyWiki single-file: buffer+re-serialize vs require `.tid`); representing
|
||||
partial/truncated history (Oddmuse) with completeness metadata. (T11/T13, UC-78/82.)
|
||||
10. **Content opacity & lossy-exportable** — what is visible for an opaque/proprietary shard
|
||||
(IDs/structure vs nothing vs lossy HTML); never present a lossy export as faithful; does
|
||||
shard-wiki ever hold keys? (T11, UC-61/80.)
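
For the portable span address question, one possible shape is a shard-scoped wrapper around whatever ID the backend already mints. This is only a sketch of the option the question raises, not a decision: the `shard#kind:native_id` string form, the `SpanAddress` class, and the example identifiers are all invented for illustration. It assumes shard names contain no `#` and kind labels contain no `:`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanAddress:
    """Shard-scoped wrapper around a backend-native identifier."""
    shard: str      # which attached shard the ID belongs to
    kind: str       # native ID scheme, e.g. "logseq-id" (label is an assumption)
    native_id: str  # the ID exactly as the backend minted it

    def __str__(self) -> str:
        return f"{self.shard}#{self.kind}:{self.native_id}"

    @classmethod
    def parse(cls, text: str) -> "SpanAddress":
        # Split only on the first separator so native IDs may contain ':' or '#'.
        shard, rest = text.split("#", 1)
        kind, native_id = rest.split(":", 1)
        return cls(shard, kind, native_id)
```

The native ID is carried opaquely, so the wrapper never needs to understand a backend's ID format, only which shard and scheme it came from.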
|
||||
|
||||
---
|
||||
|
||||
## 7. Sources
|
||||
|
||||
A synthesis; primary sources are the fourteen dives' `findings.md` files plus
|
||||
A synthesis; primary sources are the ~23 dives' `findings.md` files plus
|
||||
`research/260608-{federation-concepts,wikiengines-overview,c2-wiki-origins,yawex-prior-art}`.
|
||||
No new external research was performed in v2.
|
||||
No new external research was performed in v3 (it consolidates the WP-0003 batch dives
|
||||
260614-{federated-wiki,wikibase,forge-wikis,tiddlywiki,ikiwiki,quip,mojomojo,oddmuse,
|
||||
usemodwiki}).
|
||||
|
||||
Cross-references: `spec/UseCaseCatalog.md` (UC-26–UC-67),
|
||||
Cross-references: `spec/UseCaseCatalog.md` (UC-26–UC-82),
|
||||
`workplans/SHARD-WP-0002-federation-architecture.md` (T1–T16),
|
||||
`workplans/SHARD-WP-0003-engine-dives-batch.md` (the engine batch, done),
|
||||
`INTENT.md` (constraints), [[shard-wiki-auth-in-core-decision]].
|
||||
|
||||
---
|
||||
|
||||
## 8. Traceability
|
||||
|
||||
- Consolidates: all fourteen deep dives + federation/origin research into one capability
|
||||
model (v2).
|
||||
- Feeds: `SHARD-WP-0002` T11 (**thirteen-spectra** vocabulary, incl. content-opacity +
|
||||
merge-model), T12 (page-model breadth: computed metadata, identity≠placement, non-MD),
|
||||
T13 (history incl. CRDT-log = supplement), T14 (full attachment-mode taxonomy incl.
|
||||
CRDT-replica + P2P + interchange-mirror + local-REST), T15 (lossy incl. HTML), T16
|
||||
(addressing, content identity, identity≠placement, derived index, dimensional/query).
|
||||
- UC coverage extended in the workplan from UC-34–UC-59 to **UC-34–UC-67**.
|
||||
- No UCs added (synthesis only); no boundary changes (INTENT Stability Note untouched).
|
||||
</content>
|
||||
- Consolidates: all ~23 deep dives + federation/origin research into one capability model
|
||||
(v3 adds the WP-0003 batch).
|
||||
- Feeds: `SHARD-WP-0002` **T1–T6** (federation-model taxonomy §2.5 — selectable/composable
|
||||
federation), **T11** (**fourteen-spectra** vocabulary, incl. provenance granularity +
|
||||
expanded attachment modes + payload-format facet), T12 (page-model breadth: typed-graph,
|
||||
inline-embedded objects, computed metadata, identity≠placement, non-MD), T13 (history incl.
|
||||
DB-rows + partial-flat-file = supplement, completeness metadata, journal-shaped
|
||||
coordination journal), T14 (full attachment taxonomy incl. git-IS-store / container /
|
||||
direct-DB / REST-file-hybrid + source-of-truth per binding), T15 (lossy + not-Markdown
|
||||
graph/HTML), T16 (addressing incl. statement GUID + opaque stable identity, graph query /
|
||||
federated SERVICE).
|
||||
- UC coverage extended in the workplan from UC-34–UC-67 to **UC-34–UC-82**.
|
||||
- No UCs added (synthesis only); no boundary changes (INTENT Stability Note untouched —
|
||||
the federation-model taxonomy is a refinement of *how* the coordination layer works, not a
|
||||
redefinition of shard / Git's role / orchestrator-vs-engine).
|
||||
|
||||
33
research/260614-squeak-pharo-deep-dive/README.md
Normal file
@@ -0,0 +1,33 @@
|
||||
# 260614 — Squeak & Pharo (image-based Smalltalk) deep dive
|
||||
|
||||
Date: 2026-06-14 · Source: **SHARD-WP-0004 T6 (Squeak) + T8 (Pharo)** — combined memo
|
||||
(justified merge: both are image-based Smalltalks; Pharo is T8's thin "context for T6/T7").
|
||||
|
||||
## What this is
|
||||
|
||||
A deep dive into the **image-based live-object** environment — Squeak and Pharo (the
|
||||
substrate that Glamorous Toolkit (T7) runs on): the **image** as a persistent world of live objects
|
||||
with **no file/document/app boundary**, the live **inspector**, and Pharo's retreat to
|
||||
**code-as-text in git** (Tonel/Iceberg).
|
||||
|
||||
## Why it matters
|
||||
|
||||
- The **purest "live" end** of the batch's spectrum (literate source → notebook snapshot →
|
||||
GT/Lepiter live-over-files → **image: everything live**). Names the **live↔snapshot** axis
|
||||
the projection model (T16) must carry.
|
||||
- Hardens the **image-is-not-a-store** boundary (opaque monolithic non-diffable blob; no
|
||||
page identity/history/provenance) — generalizes "attach files, not the kernel/image"
|
||||
(UC-84, T7) into a named binding boundary (T14).
|
||||
- Pharo **confirms** the resolution: even image traditions externalize to **git-versionable
|
||||
text** (Tonel) to version/collaborate — files-canonical from the Smalltalk side.
|
||||
|
||||
## Yield
|
||||
|
||||
- **No new UC** (boundary / design prior art; covers T6 and T8). Boundary for UC-34/35/79;
|
||||
links UC-83/84 (live→snapshot), UC-54/47/48 (live-object inspection), UC-76/79 (Tonel/git).
|
||||
|
||||
## Contents
|
||||
|
||||
| Path | Role |
|
||||
|------|------|
|
||||
| `findings.md` | The image & live objects, the image-as-store anti-pattern, Pharo Tonel/Iceberg→git, INTENT mapping, UC disposition (enrichment-only), architecture notes (T14 boundary, T16 live↔snapshot axis) |
|
||||
141
research/260614-squeak-pharo-deep-dive/findings.md
Normal file
@@ -0,0 +1,141 @@
|
||||
# Squeak & Pharo (image-based Smalltalk) — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0004 **T6 (Squeak)** + **T8 (Pharo)** — combined
|
||||
(justified merge: both are image-based Smalltalks; Pharo is T8's "context for T6/T7," kept
|
||||
brief per the workplan). · **Subject:** the **image-based live-object** environment and what
|
||||
it teaches (and warns) about shard-wiki's page model.
|
||||
|
||||
## Why this dive (and why merged)
|
||||
|
||||
Squeak (1996, the Alan Kay/Ingalls lineage) and Pharo (2008 fork, the substrate **Glamorous
|
||||
Toolkit** (T7) runs on) are the same idea: **a persistent world of live objects, the "image,"
|
||||
with no file/document/application boundary.** They are **not candidate shards** — they are
|
||||
the **anti-pattern boundary** for shard-wiki's files-canonical stance *and* the inspiration
|
||||
behind moldable inspection (T7). T8 (Pharo) is folded here as the workplan allows: its
|
||||
distinct contribution over Squeak (Tonel/Iceberg file-based code → git) is the one piece that
|
||||
*does* touch our concerns, covered in §3.
|
||||
|
||||
## 1. The image: knowledge-as-live-objects
|
||||
|
||||
- A Smalltalk **image** is a serialized snapshot of the **entire object memory** — every
|
||||
object, class, tool, window, and the running program state — persisted as one binary file
|
||||
(`.image` + `.changes` log). You resume exactly where you left off.
|
||||
- **"Everything is a live object"**: code, data, UI (Morphic), the debugger, the inspector —
|
||||
all are objects you can open, message, and modify *in place*, while running. There is **no
|
||||
edit/compile/run cycle** and **no document-vs-app distinction**.
|
||||
- The **inspector** lets you open any object and explore/modify its state live — the direct
|
||||
ancestor of GT's moldable inspector (T7), but generic rather than domain-molded.
|
||||
|
||||
This is the **purest "live" end** of the spectrum the whole batch traverses: literate source
|
||||
(static) → notebook captured output (snapshot) → GT/Lepiter (live results over files) →
|
||||
**image (everything live, nothing inherently a file).**
|
||||
|
||||
## 2. The boundary: image-as-store is the anti-pattern
|
||||
|
||||
The image directly **contradicts** shard-wiki's design constraints, which is exactly why it's
|
||||
worth recording as a hard boundary:
|
||||
|
||||
- **Opaque, monolithic, non-diffable.** The image is one big binary blob of entangled live
|
||||
state — no per-page identity, no text diff, no mergeable history, no provenance per unit.
|
||||
It violates *union-without-erasure granularity*, *Markdown-first*, and *git-addressable
|
||||
coordination*.
|
||||
- **No stable addressable "page."** Knowledge is an object graph in memory, not addressable
|
||||
documents — there is nothing to attach at page granularity without an export step.
|
||||
- **History is a `.changes` log**, a serial source-change stream, not a content history.
|
||||
|
||||
**Conclusion (boundary recorded):** an image is **not a shard and not a store**. This is the
|
||||
generalized form of the rule already hit at Jupyter (UC-84) and GT (T7): *attach the
|
||||
exported files, never the live image/kernel.* The image can only participate via an **export
|
||||
projection** (objects/code → files), which is a **derivation-projection** (T1) that
|
||||
**degrades the liveness to a static snapshot**.
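
The boundary rule can be stated as a small guard. This is a hypothetical sketch: the `Liveness` levels mirror the spectrum named in §1, while `attached_liveness` and its `exported_to_files` flag are invented for illustration and are not part of any contract.

```python
from enum import IntEnum


class Liveness(IntEnum):
    """Positions on the live<->snapshot axis, least to most live."""
    STATIC_SOURCE = 0    # literate source
    SNAPSHOT = 1         # notebook captured output, or an exported image
    LIVE_OVER_FILES = 2  # GT/Lepiter: live results over files
    IMAGE = 3            # everything live, nothing inherently a file


def attached_liveness(source: Liveness, exported_to_files: bool = False) -> Liveness:
    """Liveness of the form that may be attached, or raise if none exists."""
    if source is Liveness.IMAGE:
        if not exported_to_files:
            # An image is not a shard and not a store.
            raise ValueError("attach the exported files, never the live image")
        # The export projection degrades liveness to a static snapshot.
        return Liveness.SNAPSHOT
    return source
```

Raising instead of silently downgrading keeps the design-bug boundary visible: an image only ever appears in the union through an explicit export step.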
|
||||
|
||||
## 3. Pharo's twist: code-as-files (Tonel) → git (the one actionable bit)
|
||||
|
||||
Pharo (T8) matters precisely where it **retreats from pure-image**:
|
||||
|
||||
- **Tonel / FileTree** serialize each class/method as **plain-text files** in a directory,
|
||||
and **Iceberg** manages those files as a **git repository** — so Pharo code lives in git as
|
||||
text, diffable and mergeable, *outside* the image.
|
||||
- This is the **same move** as Lepiter (T7), nbstripout/Jupytext (T3), and ikiwiki source
|
||||
(UC-79): **the durable, attachable artifact is the file representation; the live
|
||||
environment is a layer above it.** It confirms our stance from the *Smalltalk* side: even
|
||||
the most image-centric tradition externalizes to **git-versionable text** to collaborate.
|
||||
|
||||
So Pharo adds **no new page-model idea** beyond "image-based environments still externalize
|
||||
to git text" — exactly the "keep brief / fold" expectation. Its value is **confirming the
|
||||
boundary resolution**: attach the **Tonel/git source**, treat the image as live-only.
|
||||
|
||||
## 4. INTENT mapping
|
||||
|
||||
### Inspiration (keep)
|
||||
|
||||
- **Live-object inspection** is the seed of moldable views (T7/UC-54): the *idea* that any
|
||||
unit can be opened and explained interactively. shard-wiki adopts this as **projection/
|
||||
view**, not as a storage model.
|
||||
- **Resume-where-you-left-off** liveness names the far end of the **live↔snapshot** axis the
|
||||
contract must place every computed/projected view on (UC-83/84): the more live the source,
|
||||
the more its attached form is a **degrading snapshot**.
|
||||
|
||||
### Boundary (enforce — design-bug if violated)
|
||||
|
||||
- **Image-as-store is a design-bug boundary.** Never model an image (or any monolithic live
|
||||
memory blob) as a shard/store; participate only via **export → files** (a degrading
|
||||
derivation-projection). Generalizes "attach files, not the kernel/image" (UC-84, T7).
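
The boundary above can be sketched as a guard at attach time. This is a minimal illustration, not the project's binding code: the function name, the error class, and the assumption that Squeak/Pharo images are recognizable by a `.image` file suffix are all hypothetical choices made for the example.

```python
from pathlib import PurePosixPath

# Assumption: a Smalltalk image is recognizable by its ".image" suffix.
IMAGE_SUFFIXES = {".image"}


class ImageIsNotAStoreError(ValueError):
    """Raised when something tries to attach a live-memory image as a shard."""


def check_attach_target(path: str) -> str:
    """Return the path unchanged if it may be attached, otherwise raise."""
    if PurePosixPath(path).suffix.lower() in IMAGE_SUFFIXES:
        raise ImageIsNotAStoreError(
            f"{path}: an image is not a shard or a store; attach its exported files instead"
        )
    return path
```

An exported source file (the path used in the test is purely illustrative) passes through, while the image itself is rejected with a message that points at the export route.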
|
||||
|
||||
### Confirmation (Pharo)
|
||||
|
||||
- Even pure-image traditions externalize to **git-versionable text** (Tonel/Iceberg) to
|
||||
version and collaborate — reinforcing **files-canonical + git coordination** as the
|
||||
durable substrate; the live environment sits above it.
|
||||
|
||||
## 5. UC disposition (enrichment-only — no new UC)
|
||||
|
||||
| Mechanism (findings §) | Catalog UC / thread |
|------------------------|---------------------|
| Live-object inspector = generic ancestor of moldable views (§1) | links UC-54, UC-47/48 (T7) |
| Image = opaque monolithic non-diffable blob; not a page/store (§2) | **boundary** for UC-34/UC-35/UC-79 (granularity, identity, files-canonical) |
| Image participates only via export→files = degrading derivation-projection (§2) | links UC-83, UC-84 (live→snapshot) |
| Pharo Tonel/Iceberg: code-as-text in git (§3) | links UC-79, UC-76 (git-canonical text) |
| `.changes` = serial source-change log, not content history (§1) | links UC-36 (history shape) |
|
||||
|
||||
Both Squeak and Pharo are **design prior art / boundary**, not candidate shards → **no new
|
||||
UC**. They sharpen the **live↔snapshot** axis and harden the **image-is-not-a-store**
|
||||
boundary; Pharo confirms even image traditions externalize to git text.
|
||||
|
||||
## 6. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T14 (attach binding):** record the **image-is-not-a-store** boundary explicitly — a
|
||||
monolithic live-memory blob is never an attach target; participation is via **export→files**
|
||||
only. Generalize the "attach files, not the kernel/image" rule (UC-84, GT T7) to a named
|
||||
boundary in the binding taxonomy.
|
||||
- **T16 (projection):** add the **live↔snapshot** axis to the projection model — every
|
||||
computed/projected view sits somewhere between "live (re-derivable on demand)" and "static
|
||||
snapshot," and the more live the source, the more its attached form must be a clearly-
|
||||
marked degrading snapshot.
|
||||
- **T11/T12:** the live-object inspector is the *inspiration* for the moldable view registry
|
||||
(T7), not a storage shape; nothing new to the page model itself.
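
One way to picture the T16 note is a label carried by every projection. The sketch below is only an illustration under stated assumptions: the `Liveness` levels, the `ProjectionLabel` class, and its field names are invented here and do not come from SHARD-WP-0002.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum
from typing import Optional


class Liveness(IntEnum):
    """Hypothetical positions on the live<->snapshot axis, least to most live."""
    STATIC_SNAPSHOT = 0  # frozen export, e.g. files exported from an image
    REDERIVABLE = 1      # can be re-derived on demand from its source
    LIVE = 2             # exists only while the source environment runs


@dataclass(frozen=True)
class ProjectionLabel:
    source_liveness: Liveness            # how live the original source is
    attached_liveness: Liveness          # how live the attached form is
    snapshot_taken_at: Optional[datetime] = None

    def is_degraded(self) -> bool:
        # The attached form degrades when it is less live than its source.
        return self.attached_liveness < self.source_liveness

    def describe(self) -> str:
        if not self.is_degraded():
            return "attached form is as live as its source"
        when = self.snapshot_taken_at.isoformat() if self.snapshot_taken_at else "unknown time"
        return f"degrading snapshot of a {self.source_liveness.name} source, taken at {when}"
```

A union view could call `describe()` to mark a snapshot clearly instead of presenting it as the live artifact.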
|
||||
|
||||
## 7. Open questions
|
||||
|
||||
1. Is **live↔snapshot** an explicit, first-class metadata axis on every projection (so the
|
||||
union can label "this view was live / is a snapshot from time T"), or implicit per
|
||||
capability? (Recurs across UC-83/84, GT, Mathematica `Dynamic`, Strudel T5.)
|
||||
2. Do we ever ingest a Smalltalk project by attaching its **Tonel/git** repo as an ordinary
|
||||
git-text shard (no Smalltalk-specific adapter needed), confirming the boundary resolution?
|
||||
|
||||
## 8. Sources
|
||||
|
||||
- Squeak: `squeak.org`, the *Back to the Future* (Squeak) paper (Ingalls et al.), Morphic;
|
||||
image/`.changes` model.
|
||||
- Pharo: `pharo.org`, **Tonel** format, **Iceberg** (git integration), FileTree.
|
||||
- prior: `research/260614-glamorous-toolkit-deep-dive/` (moldable inspector, Lepiter, T7);
|
||||
`research/260614-jupyter-deep-dive/` (live→snapshot boundary, UC-84).
|
||||
|
||||
## 9. Traceability
|
||||
|
||||
**No new UC** (boundary / design prior art; covers **T6 Squeak** and **T8 Pharo** in one
|
||||
justified-merge memo). Boundary hardened for: UC-34/UC-35/UC-79 (image-is-not-a-store);
|
||||
links UC-83/UC-84 (live→snapshot), UC-54/UC-47/UC-48 (live-object inspection → moldable
|
||||
views), UC-76/UC-79 (Pharo Tonel/git text), UC-36 (`.changes` history shape). Architecture
|
||||
cross-refs: SHARD-WP-0002 T14 (image-is-not-a-store boundary; export→files only), T16
|
||||
(live↔snapshot projection axis).
|
||||
30
research/260614-strudel-deep-dive/README.md
Normal file
@@ -0,0 +1,30 @@
|
||||
# 260614 — Strudel.cc (live-coding REPL) deep dive
|
||||
|
||||
Date: 2026-06-14 · Source: **SHARD-WP-0004 T5**
|
||||
|
||||
## What this is
|
||||
|
||||
A deep dive into **Strudel.cc** (the JavaScript port of **TidalCycles**): a browser
|
||||
**live-coding REPL** where terse **pattern source** is **evaluated live into time-based
|
||||
audio** — code as a running musical performance, with **no document, no output cell, no file
|
||||
of results**.
|
||||
|
||||
## Why it matters
|
||||
|
||||
- The **extreme of the live↔snapshot axis** (named at T6): output is **temporal, generative,
|
||||
performative**, so there is **no faithful static form** — the best static projection is
|
||||
**source (canonical, diffable) + an optional audio recording snapshot**, marked as one
|
||||
performance. The **honesty test** for union-without-erasure + graceful degradation.
|
||||
- Bounds the projection model's live end: ahead-of-time → view-time one-shot → continuous →
|
||||
**irreducibly live/temporal (recording-only)**.
|
||||
|
||||
## Yield
|
||||
|
||||
- **No new UC** (enrichment / design prior art; far live end). Enriches **UC-54, UC-55**;
|
||||
links UC-83/84, UC-37, UC-35.
|
||||
|
||||
## Contents
|
||||
|
||||
| Path | Role |
|------|------|
| `findings.md` | Code-as-live-performance, the limit of static projection, INTENT mapping, UC disposition (enrichment-only), architecture notes (T16 far end of live↔snapshot axis) |
|
||||
118
research/260614-strudel-deep-dive/findings.md
Normal file
@@ -0,0 +1,118 @@
|
||||
# Strudel.cc — live-coding REPL — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0004 T5 · **Subject:** Strudel.cc (the
|
||||
JavaScript port of TidalCycles) — a browser **live-coding REPL** where code is a **running
|
||||
musical performance**.
|
||||
|
||||
## Why this dive
|
||||
|
||||
The closing, lowest-priority dive — and the **extreme** of the live↔snapshot axis. Where
|
||||
Processing (T4) renders *visual* output at view time, Strudel renders **time-based audio**
|
||||
that only exists *while running and evolving*. It is the limit case for "can a page be a live
|
||||
computation?" — the answer where **static projection is least faithful**. Enrichment-only;
|
||||
feeds UC-54/55 and hardens the snapshot-degradation rule.
|
||||
|
||||
## 1. Code as a live, evaluated performance
|
||||
|
||||
- **Strudel** ports **TidalCycles**' pattern language to JavaScript, running entirely in the
|
||||
**browser** (Web Audio). You write **pattern expressions** (e.g. `note("c e g")`,
|
||||
`sound("bd sd")` with transformations) and **evaluate them live**; the sound updates
|
||||
**without stopping** — the essence of *live coding*.
|
||||
- The artifact is **terse source text** (a pattern); the "content" is the **sound it
|
||||
produces over time**. There is **no document, no output cell, no file of results** —
|
||||
output is **ephemeral, temporal, and performative**.
|
||||
- A Strudel "page" (a shared REPL link / snippet) is **source + the implicit promise of a
|
||||
running evaluation**. The source is tiny and diffable; the experience is not capturable as
|
||||
text.
|
||||
|
||||
## 2. The limit of static projection
|
||||
|
||||
Strudel pushes past Processing on every "live" dimension:
|
||||
|
||||
- **Temporal & generative** — output unfolds over time and may be **non-deterministic**
|
||||
(randomness, evolving state). There is no single "frame"; the faithful capture is a
|
||||
**recording (audio), itself just one performance**, not the artifact.
|
||||
- **Performative** — the value is partly the **act of live editing**; even a recording loses
|
||||
the live-coding dimension.
|
||||
- So on the **live↔snapshot axis** (named at T6), Strudel sits at the **far live end**: the
|
||||
best static projection is **(a) the source** (canonical, diffable) **+ (b) an optional
|
||||
audio recording snapshot**, explicitly marked as one rendering of a live/temporal artifact.
|
||||
|
||||
This makes Strudel the **honesty test** for the contract: shard-wiki must be able to attach
|
||||
such a source, present it truthfully (here is the source; a live render needs the runtime; a
|
||||
recording is one performance), and **never imply a static page captures it**.
|
||||
|
||||
## 3. INTENT mapping (enrichment-only — no new UC)
|
||||
|
||||
### Reinforcements / refinements
|
||||
|
||||
- **Live-evaluated, time-based content (UC-54/55).** Strudel is the extreme executable-as-
|
||||
page: **source canonical, presentation = a temporal live evaluation**. Confirms the page
|
||||
model must represent content whose rendered form is **time-based / generative / performative**.
|
||||
- **live↔snapshot axis (T6) — far end.** Establishes the **upper bound**: some content is
|
||||
**irreducibly live**; static projection degrades to **source + a recording snapshot**, with
|
||||
honesty about what's lost. Generalizes Processing's "snapshot frame" to "recording of one
|
||||
performance."
|
||||
- **Graceful degradation (INTENT).** A backend that can't run the REPL still serves the
|
||||
**source** (tiny, diffable) and any **recording** as read/projection/backup — the
|
||||
limited-backend-still-usable rule, at the hardest content type.
|
||||
- **Union without erasure.** Presenting a Strudel shard must surface "this is a **live
|
||||
temporal artifact**; what you see/hear statically is **source / one recording**" — never
|
||||
hide the liveness or imply completeness.
|
||||
|
||||
### Boundaries
|
||||
|
||||
- shard-wiki is **not an audio/REPL runtime**; default = attach the **source** + offer/store
|
||||
a **recording** with provenance; live in-viewer evaluation is a gated capability (trust/
|
||||
sandbox, like Processing T4). Source is canonical; everything rendered is a degrading,
|
||||
view-time/temporal projection.
|
||||
|
||||
## 4. UC disposition (enrichment-only)
|
||||
|
||||
| Mechanism (findings §) | Catalog UC / thread |
|------------------------|---------------------|
| Pattern source = live, evaluated, time-based performance (§1) | UC-54 / UC-55 (enriched: time-based/generative executable content) |
| Output ephemeral/temporal/non-deterministic → no faithful static form (§2) | links live↔snapshot axis (T6), far end |
| Best static projection = source + audio recording snapshot, marked as one performance (§2) | links UC-83/UC-84 (degrading projection), UC-37 (recording as backup) |
| Limited backend still serves source + recording (§3) | links UC-37 graceful degradation |
| Live in-viewer evaluation = capability + trust/sandbox (§3) | links UC-35 |
|
||||
|
||||
**No new UC** — Strudel is design prior art marking the **far live end** of the projection/
|
||||
liveness model; it adds no orchestration scenario, it bounds one.
|
||||
|
||||
## 5. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T16 (projection):** anchor the **far end of the live↔snapshot axis** — content can be
|
||||
**irreducibly live/temporal/generative**; the contract must allow a projection to declare
|
||||
"no faithful static form; static = source + a marked recording." Combined with T4's
|
||||
**materialization-timing** and **continuity** facets, the projection model now spans:
|
||||
ahead-of-time materialized → view-time one-shot → view-time continuous/interactive →
|
||||
**temporal/generative/performative (recording-only snapshot)**.
|
||||
- **T12/T15 (page model):** **time-based / generative executable content** as a page-model
|
||||
edge; source canonical, render temporal.
|
||||
- **T11 (capabilities):** "live-evaluate (audio/REPL) in viewer" capability + trust/sandbox;
|
||||
default off → source + recording.
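
The T16 declaration ("no faithful static form; static = source + a marked recording") could be modeled roughly as follows. Every name here (`ProjectionKind`, `Recording`, `StaticFallback`, the field names, the sample URI and revision) is a hypothetical illustration; only the four-step spectrum and the "one performance, time T, source rev R" provenance come from the notes above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ProjectionKind(Enum):
    AHEAD_OF_TIME = "ahead-of-time materialized"
    VIEW_TIME_ONE_SHOT = "view-time one-shot"
    VIEW_TIME_CONTINUOUS = "view-time continuous/interactive"
    TEMPORAL_PERFORMATIVE = "temporal/generative/performative"


@dataclass(frozen=True)
class Recording:
    uri: str           # where the audio snapshot is stored
    performed_at: str  # time T of this one performance
    source_rev: str    # source revision R that was playing


@dataclass(frozen=True)
class StaticFallback:
    kind: ProjectionKind
    source_text: str
    recording: Optional[Recording] = None

    @property
    def irreducibly_live(self) -> bool:
        return self.kind is ProjectionKind.TEMPORAL_PERFORMATIVE

    def notice(self) -> str:
        """Text a viewer could show so the static view never implies completeness."""
        if not self.irreducibly_live:
            return f"projection: {self.kind.value}"
        if self.recording is None:
            return "live temporal artifact; static view shows source only"
        r = self.recording
        return ("live temporal artifact; static view shows source plus one recording "
                f"(performed {r.performed_at}, source rev {r.source_rev})")
```

Keeping the recording optional matches the default described above: the source is always served, and a recording is an extra snapshot with provenance rather than the artifact itself.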
|
||||
|
||||
## 6. Open questions
|
||||
|
||||
1. Does the contract carry an explicit **"irreducibly live / no faithful static form"** flag
|
||||
on a projection (so the union renders the honest fallback automatically)? (The far-end
|
||||
resolution of the live↔snapshot axis open question from T6.)
|
||||
2. Is a **recording** modeled as a stored **derivation-projection snapshot with provenance**
|
||||
(one performance, time T, source rev R), reusing the UC-84 snapshot machinery?
|
||||
|
||||
## 7. Sources
|
||||
|
||||
- `strudel.cc` (docs, REPL); **TidalCycles** (`tidalcycles.org`) — the pattern language
|
||||
Strudel ports; Web Audio.
|
||||
- prior: `research/260614-processing-deep-dive/` (view-time render, continuity, UC-54/55);
|
||||
`research/260614-squeak-pharo-deep-dive/` (live↔snapshot axis);
|
||||
`research/260614-mathematica-deep-dive/` (`Dynamic` interactive, snapshot-only).
|
||||
|
||||
## 8. Traceability
|
||||
|
||||
**No new UC** (enrichment / design prior art; the far live end). Enriched: UC-54, UC-55;
|
||||
links UC-83/UC-84 (degrading projection), UC-37 (recording = backup / graceful degradation),
|
||||
UC-35 (live-evaluate capability + trust). Architecture cross-refs: SHARD-WP-0002 T16 (far end
|
||||
of live↔snapshot axis: irreducibly-live content, static = source + marked recording), T12/T15
|
||||
(time-based/generative executable content), T11 (live-evaluate capability + sandbox).
|
||||
16
research/260614-tiddlywiki-deep-dive/README.md
Normal file
@@ -0,0 +1,16 @@
|
||||
# 260614 — TiddlyWiki deep dive
|
||||
|
||||
Deep dive on **TiddlyWiki** (TW5): an entire wiki — tiddlers **plus** the app engine — in
|
||||
**one self-contained HTML file**, the **whole-file write-granularity** anchor of the
|
||||
synthesis matrix, with a Node.js **file-per-tiddler** (`.tid`) substrate as the git-diffable
|
||||
alternative, a tiddler/field record model, and **filter expressions** as the native query
|
||||
language.
|
||||
|
||||
- `findings.md` — the single-file model, tiddler data model, dual substrate, filters,
|
||||
capability profile, INTENT mapping, UC seed (UC-78), architecture notes for SHARD-WP-0002,
|
||||
open questions, sources, traceability.
|
||||
|
||||
Catalog yield: UC-78 (attach a single-file self-contained wiki as one shard — parse tiddlers
|
||||
out, project; write = rewrite the whole file, the coarsest write-granularity anchor).
|
||||
Enriched UC-35/40/34/52/43. Feeds SHARD-WP-0002 T11 (write-granularity extreme) and T14
|
||||
(single-file vs file-per-tiddler binding).
|
||||
164
research/260614-tiddlywiki-deep-dive/findings.md
Normal file
@@ -0,0 +1,164 @@
|
||||
# TiddlyWiki — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T3 · **Subject:** TiddlyWiki / TiddlyWiki5
|
||||
(TW5), Jeremy Ruston's self-contained personal wiki.
|
||||
|
||||
## Why this dive
|
||||
|
||||
The synthesis matrix names **whole-file write granularity** as one extreme of the
|
||||
write-granularity spectrum, anchored by TiddlyWiki. This dive confirms the anchor and the
|
||||
*portability extreme* it implies — a wiki that is **a single HTML file you can email** — and
|
||||
finds the twist: TiddlyWiki also has a **Node.js file-per-tiddler** substrate, so it spans
|
||||
the granularity spectrum the way Logseq spans file/DB (UC-62). The question: how does
|
||||
shard-wiki attach a backend whose *entire content is one file*?
|
||||
|
||||
## 1. The single-file model
|
||||
|
||||
In its classic single-file form, TiddlyWiki ships as **one `.html` file** that contains **both**:
|
||||
|
||||
1. the **TiddlyWiki core** (the JavaScript engine, parser, renderer, UI), and
|
||||
2. **every tiddler** (all content), serialized into the file.
|
||||
|
||||
Open it in a browser and the file *is* the running application. There is **no server and no
|
||||
build step** — the app reconstitutes itself from the file it was loaded from. This is the
|
||||
portability extreme: a complete, self-hosting wiki in a single, emailable, USB-stick-able
|
||||
artifact that runs offline anywhere.
|
||||
|
||||
**Saving** is the catch: a browser page cannot normally overwrite the file it came from, so
|
||||
TiddlyWiki uses **"savers"** — TiddlyFox/browser extension, the File System Access API, a
|
||||
Node.js server, TiddlySpot/put-savers, or "download a new copy." Crucially, **a save
|
||||
rewrites the *entire* HTML file** (core + all tiddlers re-serialized). Hence **whole-file
|
||||
write granularity**: there is no concept of writing one page in isolation in the
|
||||
single-file mode — every save touches the whole artifact.
|
||||
|
||||
## 2. The tiddler data model
|
||||
|
||||
The atomic unit is the **tiddler** — a named record with **fields**:
|
||||
|
||||
- Core fields: **`title`** (the identity), **`text`** (the body), **`tags`**, **`created`**,
|
||||
**`modified`**, **`type`** (content type, e.g. `text/vnd.tiddlywiki`, `text/markdown`),
|
||||
plus **arbitrary custom fields** (any key→value). A tiddler is effectively a **flexible
|
||||
flat record** — closer to a typed-field record (UC-34) than to prose-with-frontmatter.
|
||||
- **Everything is a tiddler**: not just pages, but tags, macros, templates, themes, plugins,
|
||||
and the wiki's own configuration are all tiddlers. A **plugin is a bundle of tiddlers**.
|
||||
- Content markup is **WikiText** (TW5's own), though `type` can mark a tiddler as Markdown,
|
||||
JSON, image, etc. **Transclusion** is native: `{{SomeTiddler}}` embeds another tiddler;
|
||||
`{{SomeTiddler!!field}}` embeds a field.
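
To make the two transclusion forms concrete, here is a small sketch that finds them in a tiddler body. It is deliberately simplified and assumes only the two forms named above; other WikiText variants (templates, the current-tiddler shorthand, and so on) are out of scope, and the function name is hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Matches {{Title}} and {{Title!!field}} only (a simplifying assumption).
TRANSCLUSION = re.compile(r"\{\{([^{}!|]+)(?:!!([^{}|]+))?\}\}")


def find_transclusions(text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (tiddler title, field or None) for each transclusion in text."""
    return [(m.group(1).strip(), m.group(2)) for m in TRANSCLUSION.finditer(text)]
```

A projection step could use the returned pairs to record which tiddlers a page depends on.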
|
||||
|
||||
## 3. The dual substrate — single-file vs file-per-tiddler
|
||||
|
||||
TiddlyWiki on **Node.js** stores each tiddler as a **separate `.tid` file** on disk: a small
|
||||
text file with a header of `field: value` lines, a blank line, then the body. The Node
|
||||
server assembles these into the same wiki at serve time. This substrate is:
|
||||
|
||||
- **git-diffable and fine-grained** — one file per tiddler, line-level diffs, per-tiddler
|
||||
history — the *opposite* end of the granularity spectrum from the single HTML file.
|
||||
- the natural attach surface for a *versioned, multi-author* TiddlyWiki.
|
||||
|
||||
So TiddlyWiki **spans the write-granularity spectrum by substrate** (single-file = whole-file
|
||||
write; Node = file-per-tiddler write), exactly as Logseq spans file/DB (UC-62) and as the
|
||||
backend-swap question (UC-43) anticipates.
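
The `.tid` layout described above (a header of `field: value` lines, a blank line, then the body) is simple enough to parse in a few lines. This is a minimal sketch under that description only; `parse_tid` is a hypothetical helper, and real files may have edge cases it does not handle.

```python
from typing import Dict, Tuple


def parse_tid(raw: str) -> Tuple[Dict[str, str], str]:
    """Split a .tid document into (fields, body)."""
    # The first blank line separates the header from the body.
    header, _, body = raw.partition("\n\n")
    fields: Dict[str, str] = {}
    for line in header.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields, body
```

Because each tiddler is its own small text file, a one-field change shows up as a one-line diff, which is what makes this substrate git-friendly.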
|
||||
|
||||
## 4. Native query — filter expressions
|
||||
|
||||
TiddlyWiki's query language is **filter expressions** over tiddler fields, e.g.
|
||||
`[tag[todo]!tag[done]sort[modified]]` — a compact DSL that selects/orders tiddlers by field
|
||||
and tag. Lists, tables, and dynamic views are built from filters. This is a **native-query
|
||||
capability** (UC-52 tier) — less expressive than SPARQL/datalog but real, and computed over
|
||||
the tiddler store.
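
For readers unfamiliar with the DSL, the example filter `[tag[todo]!tag[done]sort[modified]]` corresponds roughly to the Python below. This is an illustrative equivalent over plain dictionaries, not TiddlyWiki's filter engine; the dictionary shape and the function name are assumptions.

```python
from typing import Dict, List


def todo_not_done_by_modified(tiddlers: List[Dict]) -> List[str]:
    """Titles tagged 'todo' but not 'done', ordered by the 'modified' field."""
    selected = [
        t for t in tiddlers
        if "todo" in t["tags"] and "done" not in t["tags"]
    ]
    return [t["title"] for t in sorted(selected, key=lambda t: t["modified"])]
```

A union query layer that cannot delegate to the native filter could fall back to this kind of evaluation over projected tiddlers.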
|
||||
|
||||
## 5. Capability profile
|
||||
|
||||
| Dimension (synthesis spectrum) | TiddlyWiki (single-file) | TiddlyWiki (Node `.tid`) |
|--------------------------------|--------------------------|--------------------------|
| Attachment mode | **file-store: one HTML file** | **file-store: dir of `.tid` files** |
| Addressing granularity | tiddler (`title`) within the file | tiddler = one file |
| Content identity | **`title`** (placement-bound) | title ↔ filename |
| Structure | flat record store w/ arbitrary **fields** + tags | same |
| History | none in-file (whole-file save) | **per-file git history** |
| Merge model | whole-file replace (no merge) | git 3-way per tiddler |
| Native query | **filter expressions** | filter expressions |
| Translation | WikiText (or per-tiddler `type`: markdown/json/…) | same |
| **Write granularity** | **whole file** (the anchor) | **file per tiddler** |
| Operational envelope | trivial (a browser; no server) | a Node server |
| Access grant | file access = full access | server/file perms |
| Content opacity | transparent (parse the HTML store) | transparent text |
| Provenance | created/modified fields | git + fields |
|
||||
|
||||
## 6. INTENT mapping
|
||||
|
||||
### Reinforcements
|
||||
|
||||
- **Graceful degradation**: a single-file TiddlyWiki is a *trivial* read-only / projection /
|
||||
backup shard — parse the tiddlers out of the HTML, project pages; no server needed. The
|
||||
limited-backend-still-usable principle at its simplest.
|
||||
- **Markdown-first but backend-neutral**: tiddlers carry a `type`, so Markdown tiddlers
|
||||
coexist with WikiText — the page model's content-type field maps directly.
|
||||
- **Typed fields** (UC-34): arbitrary tiddler fields are a flexible record model the page
|
||||
model already accommodates.
|
||||
- **Backend-swap under stable identity** (UC-43): single-file ↔ Node `.tid` is the same
|
||||
logical wiki on two substrates — the migration UC-43 anticipates, within one engine.
|
||||
|
||||
### Divergences (boundaries / notes)
|
||||
|
||||
- **Whole-file write granularity** is a real constraint: in single-file mode shard-wiki
|
||||
cannot write one page atomically — an overlay applied "to one page" still **rewrites the
|
||||
whole file** (T11). This is the coarsest write tier; model it explicitly so overlays/locks
|
||||
account for it (a write to any page conflicts with any concurrent write).
|
||||
- **Identity = title**, file-local; cross-shard identity (T16) layered above.
|
||||
- **The app is in the file**: when parsing a single-file TiddlyWiki, shard-wiki must extract
|
||||
the **tiddler store** and ignore the embedded engine — i.e. treat the HTML as a *container
|
||||
format*, not as page content (don't mistake the app for content).
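
The conflict rule in the first bullet can be stated as a lock-scope function. The sketch below is an assumption-laden illustration: `WriteGranularity`, `lock_key`, and `writes_conflict` are hypothetical names, and real overlay/lock handling would need more than key equality.

```python
from enum import Enum
from typing import Optional, Tuple


class WriteGranularity(Enum):
    WHOLE_FILE = "whole-file"        # single-file TiddlyWiki
    FILE_PER_PAGE = "file-per-page"  # Node .tid directory

LockKey = Tuple[str, Optional[str]]


def lock_key(shard_id: str, granularity: WriteGranularity, page_title: str) -> LockKey:
    """Return the unit a write must lock for the given granularity."""
    if granularity is WriteGranularity.WHOLE_FILE:
        # Any page write rewrites the whole file, so all writes share one key.
        return (shard_id, None)
    return (shard_id, page_title)


def writes_conflict(a: LockKey, b: LockKey) -> bool:
    return a == b
```

With this shape, two writes to different pages conflict on a single-file shard but not on a `.tid` directory, which is the difference between the two tiers.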
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **Single-file self-contained wiki as a first-class file-store shard** — container-format
|
||||
parse, whole-file write granularity (UC-78); the portability/granularity extreme.
|
||||
2. **Whole-file write granularity as a named tier** (T11) with overlay/lock implications.
|
||||
3. **Dual-substrate binding** (single-file vs `.tid` dir) as another instance of
|
||||
substrate-choice under one identity (UC-43/UC-62).
|
||||
|
||||
## 7. UC seed
|
||||
|
||||
| # | Seed | Disposition |
|---|------|-------------|
| UC-78 | Attach a **single-file self-contained wiki** (TiddlyWiki HTML) as one shard: parse tiddlers out of the container, project pages; **write = rewrite the whole file** (whole-file write granularity, the coarsest tier) | **new** |
| - | whole-file write granularity anchor + overlay/lock implications | enrich **UC-35** |
| - | single HTML file as a file-store shard (container format) | enrich **UC-40** |
| - | tiddler arbitrary fields = flexible record | enrich **UC-34** |
| - | filter expressions as a native-query tier | enrich **UC-52** |
| - | single-file ↔ Node `.tid` substrate swap | enrich **UC-43** |
|
||||
|
||||
## 8. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T11 (capability / write granularity):** confirm **whole-file** as the coarsest named
|
||||
write tier (anchored by single-file TiddlyWiki), with the implication that an overlay to
|
||||
*any* page conflicts with concurrent writes (no per-page atomicity). File-per-tiddler is
|
||||
the fine tier on the same engine.
|
||||
- **T14 (attach binding):** a single-file wiki binds as a **container-format file-store**
|
||||
(parse tiddler store, ignore embedded engine); a Node TiddlyWiki binds as a **dir of
|
||||
`.tid` files** (git-diffable). One engine, two bindings — parameterize like UC-43.
|
||||
- **Native query:** filter expressions are a low-mid native-query tier between "none" and
|
||||
datalog/SPARQL — delegate where present (UC-52).
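
The container-format binding in T14 (parse the tiddler store, ignore the embedded engine) might look like the sketch below. It rests on an assumption that should be verified against the TiddlyWiki version being attached: that the file keeps its tiddlers as a JSON array inside a `script` element carrying the class `tiddlywiki-tiddler-store`. Older files are understood to use a different, div-based store that this sketch does not handle. The class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser
from typing import Dict, List


class TiddlerStoreExtractor(HTMLParser):
    # Assumption: the JSON store is marked with this class name.
    STORE_CLASS = "tiddlywiki-tiddler-store"

    def __init__(self) -> None:
        super().__init__()
        self._in_store = False
        self._chunks: List[str] = []
        self.tiddlers: List[Dict] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            classes = (dict(attrs).get("class") or "").split()
            self._in_store = self.STORE_CLASS in classes

    def handle_data(self, data):
        # Script bodies outside the store (the engine) are skipped.
        if self._in_store:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_store:
            self.tiddlers.extend(json.loads("".join(self._chunks)))
            self._chunks.clear()
            self._in_store = False


def extract_tiddlers(html: str) -> List[Dict]:
    parser = TiddlerStoreExtractor()
    parser.feed(html)
    parser.close()
    return parser.tiddlers
```

Treating the HTML purely as a container keeps the engine's own JavaScript out of the projected pages.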
|
||||
|
||||
## 9. Open questions
|
||||
|
||||
1. In single-file mode, how does shard-wiki represent **per-page overlays** when writes are
|
||||
whole-file — buffer overlays and re-serialize, or require the Node `.tid` substrate for
|
||||
write-through and treat single-file as read/projection/backup only?
|
||||
2. Is a single-file TiddlyWiki's **embedded plugins/config** ever relevant to the union, or
|
||||
strictly ignored as app-internals (parse only content tiddlers)?
|
||||
3. Does shard-wiki expose tiddler **filter expressions** as a delegated query, or only its
|
||||
own union query over projected tiddlers?
|
||||
|
||||
## 10. Sources
|
||||
|
||||
- TiddlyWiki.com — *Tiddlers*, *TiddlerFields*, *Filters*, *Saving*, *Node.js* docs
|
||||
- *TiddlyWiki5* GitHub (Jermolene/TiddlyWiki5) — `.tid` file format, store structure
|
||||
- prior: `research/260614-logseq-deep-dive/` (file/DB dual substrate, UC-62)
|
||||
|
||||
## 11. Traceability
|
||||
|
||||
New UC **UC-78** carries the marker **⊡** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md`. Enriched: UC-35, UC-40, UC-34, UC-52, UC-43. Architecture
|
||||
cross-refs: SHARD-WP-0002 T11 (whole-file tier), T14 (dual binding).
|
||||
14
research/260614-usemodwiki-deep-dive/README.md
Normal file
@@ -0,0 +1,14 @@
|
||||
# 260614 — UseModWiki deep dive
|
||||
|
||||
Deep dive on **UseModWiki**: the **flat-file ancestor** — a Perl CGI wiki (Clifford Adams,
|
||||
2000) descended from AtisWiki/CvWiki, **CamelCase** linking, plain flat-file page + history
|
||||
storage, and the **engine Wikipedia originally ran on** (MediaWiki Phase I). Origins/lineage
|
||||
value: the minimal flat-file page+history model the whole field descends from.
|
||||
|
||||
- `findings.md` — the model, lineage, capability profile, INTENT mapping, enrichments (no new
|
||||
UC — reinforces the minimal flat-file baseline UC-82), architecture notes, sources,
|
||||
traceability.
|
||||
|
||||
Catalog yield: **enrichment-only** (reinforces UC-82) — enriched UC-01 (open wiki), UC-40
|
||||
(flat-file store), UC-25 (CamelCase naming), UC-36/41 (flat-file history). Lineage noted for
|
||||
the origins record. Feeds SHARD-WP-0002 T11 (minimal profile lineage).
|
||||
87
research/260614-usemodwiki-deep-dive/findings.md
Normal file
@@ -0,0 +1,87 @@
|
||||
# UseModWiki — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T9 · **Subject:** UseModWiki, Clifford
|
||||
Adams's flat-file Perl wiki — the lineage root of much of the field.
|
||||
|
||||
## Why this dive
|
||||
|
||||
This is a **lineage** dive, not a new-capability one. UseModWiki is the **ancestor**: the
|
||||
minimal flat-file page+history wiki the c2-era engines and **MediaWiki Phase I** descend
|
||||
from (Wikipedia ran on UseModWiki, 2001–2002). It pairs with
|
||||
`research/260608-c2-wiki-origins/` to record *where the page+history model came from*. It
|
||||
adds no new shard capability beyond the minimal flat-file floor (UC-82, Oddmuse) — its value
|
||||
is **historical grounding** and confirming the floor is genuinely the field's common root.
|
||||
|
||||
## 1. The model
|
||||
|
||||
- **Perl CGI** (`wiki.pl`), single-script, in the lineage CvWiki → AtisWiki → UseModWiki
|
||||
(c. 2000), by Clifford Adams.
|
||||
- **Flat-file storage**: each page stored as a text file (under a `db/` data directory),
|
||||
with **page history** kept (older revisions retained as files / in the page record). No
|
||||
database.
|
||||
- **Linking**: **CamelCase** (`WikiWord`) auto-links by default; later added **free links**
|
||||
`[[Like This]]`. This is the canonical CamelCase-naming lineage (UC-25).
|
||||
- **Open editing**, recent-changes, simple markup — the c2 ethos in a portable script.
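
An adapter that reads UseModWiki pages has to recognize both link styles named above. The sketch below uses a simplified WikiWord pattern (two or more capitalized word parts); it is an approximation chosen for illustration, not UseModWiki's actual link pattern, and `find_links` is a hypothetical helper.

```python
import re
from typing import List

# Simplified assumption: a WikiWord is two or more capitalized parts, e.g. RecentChanges.
WIKI_WORD = re.compile(r"\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b")
FREE_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def find_links(text: str) -> List[str]:
    """Return CamelCase links first, then free-link targets."""
    free = [m.group(1).strip() for m in FREE_LINK.finditer(text)]
    # Blank out free links so words inside them are not counted twice.
    remainder = FREE_LINK.sub(" ", text)
    return WIKI_WORD.findall(remainder) + free
```

The returned names are the legacy identities that would then be translated to the page model's own identity scheme.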
|
||||
|
||||
## 2. Lineage
|
||||
|
||||
- **MediaWiki Phase I** *was* UseModWiki — early Wikipedia ran on it before the PHP rewrite
|
||||
(Phase II/III). So UseModWiki is the **direct ancestor of MediaWiki/Wikibase** (T2) and a
|
||||
sibling-root of the c2 tradition.
|
||||
- The **flat-file page+history model** here is the shape Oddmuse (UC-82), TWiki's file store
|
||||
(UC-40), and others elaborate — confirming the **minimal file-store floor** is the field's
|
||||
common origin, not a modern simplification.
|
||||
|
||||
## 3. Capability profile
|
||||
|
||||
Essentially **identical to the minimal floor** (UC-82, Oddmuse):
|
||||
|
||||
| Dimension | UseModWiki |
|-----------|------------|
| Attachment mode | **file-store** (flat files under `db/`); CGI, no API |
| Addressing | page = file; **CamelCase** name = identity |
| Structure | flat page space; CamelCase link graph |
| History | flat-file retained revisions (may be limited) |
| Native query | none |
| Translation | simple wiki markup → Markdown (lossy) |
| Write granularity | page (file) |
| Access | open editing (optional admin password) |
| Provenance | timestamp, optional username |
|
||||
|
||||
## 4. INTENT mapping
|
||||
|
||||
- **Reinforces the minimal flat-file baseline** (UC-82): UseModWiki is the *historical*
|
||||
instance of the graceful-degradation floor — attach via its flat files; partial history
|
||||
surfaced honestly.
|
||||
- **CamelCase naming** (UC-25): the canonical source of WikiWord auto-linking — the page
|
||||
model's name/identity and link-resolution must accommodate CamelCase-derived identities.
|
||||
- **Open wiki** (UC-01): the c2 open-editing ethos at the root.
|
||||
- **Lineage grounding**: confirms shard-wiki's "Git-based Markdown" page model descends from
|
||||
(and must remain attach-compatible with) the flat-file ancestor.
|
||||
|
||||
**No new UC** — UseModWiki adds historical grounding, not a new orchestration scenario;
|
||||
its capabilities are subsumed by UC-82 (minimal flat-file baseline) plus UC-25 (CamelCase).
|
||||
|
||||
## 5. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T11:** UseModWiki = a second instance of the **minimal/floor profile** (with Oddmuse);
|
||||
confirms the floor is the field's common root, so the floor profile is the right baseline.
|
||||
- **Naming/identity:** CamelCase-derived page identities (UC-25) are part of the legacy
|
||||
identity surface the adapter must parse (and translate to the page model's identity).
|
||||
|
||||
## 6. Open questions
|
||||
|
||||
(None new — covered by UC-82 partial-history honesty, catalog open-Q30, and the c2 origins
|
||||
record. CamelCase resolution is UC-25.)
|
||||
|
||||
## 7. Sources
|
||||
|
||||
- UseModWiki — usemod.com/cgi-bin/wiki.pl (the wiki about itself); Wikipedia: *UseModWiki*,
|
||||
*MediaWiki* (Phase I history)
|
||||
- prior: `research/260608-c2-wiki-origins/`; `research/260614-oddmuse-deep-dive/` (minimal
|
||||
floor, UC-82)
|
||||
|
||||
## 8. Traceability
|
||||
|
||||
**No new UC** (reinforces UC-82). Enriched: UC-01, UC-40, UC-25, UC-36, UC-41. Lineage noted
|
||||
for the origins record. Architecture cross-ref: SHARD-WP-0002 T11 (minimal-profile lineage).
|
||||
16
research/260614-wikibase-deep-dive/README.md
Normal file
@@ -0,0 +1,16 @@
|
||||
# 260614 — Wikibase / Wikidata deep dive
|
||||
|
||||
Deep dive on **Wikibase** (the MediaWiki extension behind **Wikidata**) as the
|
||||
**entity-statement (RDF) far end** of the structure and native-query spectra: items and
|
||||
properties, statements = claim + qualifiers + references + rank, stable opaque IDs, and
|
||||
**SPARQL** (Wikidata Query Service / Blazegraph) including **federated** queries.
|
||||
|
||||
- `findings.md` — data model, RDF/SPARQL surface, storage & history, capability profile,
|
||||
INTENT mapping, UC seeds (UC-73–UC-75), architecture notes for SHARD-WP-0002, open
|
||||
questions, sources, traceability.
|
||||
|
||||
Catalog yield: UC-73 (attach a typed entity-statement / RDF shard, lossy render to a page),
|
||||
UC-74 (graph query the union via SPARQL + federated `SERVICE` cross-endpoint query), UC-75
|
||||
(statement-level provenance — references + rank per assertion). Enriched UC-34/58/52/24.
|
||||
Feeds SHARD-WP-0002 T12 (structured/typed page model) and T16 (stable language-neutral
|
||||
identity ≠ label/placement).
|
||||
200
research/260614-wikibase-deep-dive/findings.md
Normal file
@@ -0,0 +1,200 @@
|
||||
# Wikibase / Wikidata — deep dive (findings)
|
||||
|
||||
**Date:** 2026-06-14 · **Source:** SHARD-WP-0003 T2 · **Subject:** Wikibase (MediaWiki
|
||||
extension) and its flagship instance **Wikidata**, incl. the Wikidata Query Service (SPARQL).
|
||||
|
||||
## Why this dive
|
||||
|
||||
Every structured shard so far tops out at *typed records in a database*: Notion's
|
||||
database-pages, XWiki's XObjects/classes, Trilium's typed relations, Roam/Logseq's
|
||||
attribute blocks. Wikibase is a *different kind of structure altogether* — a **typed
|
||||
knowledge graph of entities and provenance-bearing statements**, queried with **SPARQL**
|
||||
over an **RDF** projection. It is the **far end of the structure spectrum** and of the
|
||||
**native-query spectrum**, and it pushes **provenance down to the individual assertion**.
|
||||
The question for shard-wiki: what does a shard look like when its "page" is *not prose but a
|
||||
set of statements*, and what does the page model / adapter contract owe such a shard?
|
||||
|
||||
## 1. The data model — entities, statements, snaks
|
||||
|
||||
**Entities** are the top-level objects, each on its own MediaWiki page with a **stable
|
||||
opaque ID**:
|
||||
|
||||
- **Item** — `Q42`. Has multilingual **labels / descriptions / aliases**, a set of
|
||||
**statements**, and **sitelinks** (links to wiki articles). The label is *annotation*,
|
||||
not identity — `Q42` is the identity, "Douglas Adams" is just its English label.
|
||||
- **Property** — `P31` ("instance of"). Also has labels/descriptions/aliases, plus a
|
||||
**fixed datatype** constraining its values (item-reference, string, time,
|
||||
globe-coordinate, quantity, monolingual-text, url, external-id, commons-media, …).
|
||||
|
||||
**Statement** = the unit of assertion on an item. Structure:
|
||||
|
||||
```
|
||||
statement = claim + references[] + rank
|
||||
claim = mainSnak + qualifiers[]
|
||||
snak = property + snaktype + (value) # snaktype ∈ value | somevalue | novalue
|
||||
```
|
||||
|
||||
- **Main snak** — the core property→value assertion (e.g. `P31` → `Q5` "human").
|
||||
- **Qualifiers** — snaks that *contextualize* the claim without being the subject (validity
|
||||
time, "as of", determination method, units). E.g. *population (P1082) = 8.4M, **point in
|
||||
time (P585) = 2020***.
|
||||
- **References** — lists of snaks citing **where the claim comes from** (a source item, a
|
||||
URL, a page number). **Provenance attached to the individual statement, not the page.**
|
||||
- **Rank** — `preferred` | `normal` | `deprecated`: relative importance among same-property
|
||||
statements (lets multiple, even contradictory, values coexist with a curation signal —
|
||||
the structured analogue of fedwiki's "chorus").
|
||||
- Each statement carries a **stable GUID** (`Q42$<uuid>`), so statements are individually
|
||||
addressable.
|
||||
|
||||
`somevalue` (known to exist, value unknown) and `novalue` (known *not* to have a value) are
|
||||
**first-class** — the model represents *known-unknowns* explicitly, which prose and most
|
||||
DBs cannot.
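The statement grammar above can be sketched as plain Python dataclasses. This is a minimal illustration of the shape only, not Wikibase's actual JSON serialization; the class names, the placeholder GUID, and the `P_SOURCE_URL` property id are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Literal, Optional

SnakType = Literal["value", "somevalue", "novalue"]
Rank = Literal["preferred", "normal", "deprecated"]


@dataclass(frozen=True)
class Snak:
    """A property plus either a value or an explicit known-unknown."""
    property: str
    snaktype: SnakType = "value"
    value: Optional[str] = None

    def __post_init__(self) -> None:
        # A value is required for 'value' snaks and forbidden otherwise.
        if (self.snaktype == "value") != (self.value is not None):
            raise ValueError("value must be set exactly when snaktype is 'value'")


@dataclass
class Claim:
    main_snak: Snak
    qualifiers: List[Snak] = field(default_factory=list)


@dataclass
class Statement:
    guid: str  # individually addressable, shaped like "Q42$<uuid>"
    claim: Claim
    references: List[List[Snak]] = field(default_factory=list)  # one inner list per reference
    rank: Rank = "normal"


# "Q42 is an instance of Q5 (human)", with one reference made of a single snak.
human = Statement(
    guid="Q42$00000000-0000-0000-0000-000000000000",  # placeholder GUID
    claim=Claim(main_snak=Snak("P31", "value", "Q5")),
    references=[[Snak("P_SOURCE_URL", "value", "https://example.org/source")]],  # hypothetical property id
)

# A known-unknown: the property applies, but the value is not known.
unknown_date = Snak("P585", "somevalue")
```

Keeping `snaktype` separate from `value` is what lets `somevalue` and `novalue` stay distinct from a missing statement.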
|
||||
|
||||
## 2. The RDF / SPARQL surface
|
||||
|
||||
Wikibase **projects entities to RDF**; the **Wikidata Query Service (WDQS)** is a
|
||||
**Blazegraph** triple store exposing a **SPARQL** endpoint. The projection is deliberately
|
||||
multi-layered:
|
||||
|
||||
- **Truthy** triples (`wdt:` prefix) — the simple "best" value, for easy queries:
|
||||
`wd:Q42 wdt:P31 wd:Q5`.
|
||||
- **Full** statements — reified so qualifiers/references/rank survive: `wd:Q42 p:P31
|
||||
?stmt . ?stmt ps:P31 wd:Q5 ; pq:P585 ?time ; prov:wasDerivedFrom ?ref`. (`p:` links the entity to
|
||||
its statement node, `ps:` gives the statement value, `pq:` a qualifier; `prov:wasDerivedFrom` points to a reference node, whose own values use `pr:`.)
|
||||
- **Federated SPARQL** — the `SERVICE <endpoint> { … }` keyword runs a sub-query against
|
||||
*another* SPARQL endpoint and joins the results. **Query-level federation is built into
|
||||
the query language** — a different federation primitive from fedwiki's fork/neighborhood.
|
||||
- **EntitySchemas / ShEx** — schemas (`E`-ids) that *validate* an entity's shape (Shape
|
||||
Expressions). Optional, declarative structure validation over the open graph.
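How the truthy layer relates to the full statements can be sketched in a few lines of Python. The sketch assumes the usual best-rank rule: per property, preferred-rank statements win if any exist, otherwise normal-rank ones are used, and deprecated ones are never exposed. `RankedValue` and `truthy` are illustrative names, not part of any Wikibase library.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RankedValue:
    property: str
    value: str
    rank: str  # "preferred", "normal", or "deprecated"


def truthy(statements: Iterable[RankedValue]) -> Dict[str, List[str]]:
    """Return the best-rank values per property (the simple 'wdt:' view)."""
    by_property: Dict[str, List[RankedValue]] = {}
    for st in statements:
        by_property.setdefault(st.property, []).append(st)

    result: Dict[str, List[str]] = {}
    for prop, group in by_property.items():
        preferred = [st.value for st in group if st.rank == "preferred"]
        normal = [st.value for st in group if st.rank == "normal"]
        best = preferred or normal  # deprecated statements never become truthy
        if best:
            result[prop] = best
    return result
```

The full statement list stays the canonical input; the truthy view is a convenience projection that drops qualifiers, references, and lower-ranked values.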
|
||||
|
||||
## 3. Storage, identity, history
|
||||
|
||||
- **Storage:** each entity is a **JSON blob stored as a MediaWiki page** (`wikibase-item` /
|
||||
`wikibase-property` content models). The RDF/SPARQL store is a **derived index** rebuilt from these
|
||||
canonical JSON entities (an *update stream* feeds WDQS) — exactly shard-wiki's
|
||||
"derived query index over a canonical store" pattern (UC-63), at planet scale.
|
||||
- **Identity:** the **opaque Q/P/L IDs are the identity**, fully decoupled from
|
||||
human-readable labels and from language. This is the cleanest real-world instance of
|
||||
**stable, language-neutral identity ≠ label/placement** — a strong reinforcement of our
|
||||
identity model (T16).
|
||||
- **History:** because each entity is one MediaWiki page, history is **page-level MediaWiki
|
||||
revisions** — every edit is a full-entity JSON snapshot with author/timestamp/comment.
|
||||
*Coarse* history granularity (whole entity per revision), but the **edit API is
|
||||
fine-grained** (`wbsetclaim`, `wbeditentity` patch individual statements). So: **fine
|
||||
write API over a coarse history unit** — a distinct point on the write/history spectra.
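The "fine write API over a coarse history unit" point can be made concrete with a toy model. This is a sketch under simplifying assumptions: `EntityPage.set_claim` is a hypothetical stand-in for a statement-level edit call, and the dict layout is a reduced illustration, not the real entity JSON.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Revision:
    author: str
    comment: str
    snapshot: dict  # the whole entity as of this revision


@dataclass
class EntityPage:
    entity_id: str
    revisions: List[Revision] = field(default_factory=list)

    def current(self) -> dict:
        """Return a private copy of the latest snapshot (or an empty entity)."""
        if not self.revisions:
            return {"id": self.entity_id, "claims": {}}
        return copy.deepcopy(self.revisions[-1].snapshot)

    def set_claim(self, guid: str, prop: str, value: str, author: str, comment: str = "") -> None:
        """Fine-grained write: touch one statement, then record a full-entity revision."""
        entity = self.current()
        statements = entity["claims"].setdefault(prop, [])
        for st in statements:
            if st["id"] == guid:
                st["value"] = value  # update the existing statement in place
                break
        else:
            statements.append({"id": guid, "value": value})
        self.revisions.append(Revision(author, comment, entity))
```

Each call edits a single statement, yet history grows by one whole-entity snapshot, so a statement-level diff has to be computed by comparing adjacent snapshots.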
|
||||
|
||||
## 4. Capability profile
|
||||
|
||||
| Dimension (synthesis spectrum) | Wikibase / Wikidata |
|
||||
|--------------------------------|---------------------|
|
||||
| Attachment mode | **external-API** (MediaWiki Action API + REST) **and** a derived **SPARQL endpoint**; self-hostable |
|
||||
| Addressing granularity | **statement** (each has a GUID) within an **entity** (Q/P id) |
|
||||
| Content identity | **stable opaque ID** (Q/P/L); labels are multilingual annotations |
|
||||
| Identity vs placement | **fully separated** — identity is language- and label-neutral |
|
||||
| Structure | **typed knowledge graph**: entities + statements (claim+qualifiers+refs+rank) |
|
||||
| History | **page-level revisions** (whole-entity JSON snapshots); fine-grained edit API |
|
||||
| Merge model | MediaWiki last-writer / edit-conflict; rank lets contradictory values coexist |
|
||||
| Native query | **SPARQL** (RDF) + **federated `SERVICE`** cross-endpoint join — the far end |
|
||||
| Translation | **not Markdown** — content *is* statements; render to prose is a lossy projection |
|
||||
| Attachment/write granularity | **statement-level writes** via API; coarse history unit |
|
||||
| Operational envelope | huge derived index (Blazegraph), rate-limited public endpoints |
|
||||
| Access grant | open read; MediaWiki user/permission model for write; self-host = own ACL |
|
||||
| Content opacity | transparent (public JSON + RDF); not encrypted |
|
||||
| Provenance | **statement-level** — references + rank per assertion (new far end) |
|
||||
|
||||
## 5. INTENT mapping
|
||||
|
||||
### Reinforcements
|
||||
|
||||
- **Stable identity ≠ placement** (T16): Q/P IDs decoupled from labels/language are the
|
||||
textbook case — adopt the principle that a page's *identity* is an opaque stable handle,
|
||||
display names are annotations.
|
||||
- **Derived index over canonical store** (UC-63): WDQS is exactly a SPARQL index rebuilt
|
||||
from canonical JSON entities via an update stream — validates the projection pattern.
|
||||
- **Union without erasure / chorus**: **rank** lets multiple (even contradictory) statements
|
||||
coexist with a curation signal rather than forcing one truth — the *structured* analogue
|
||||
of fedwiki's chorus (UC-72) and our "view multiple versions" (UC-27).
|
||||
- **Mechanism over policy**: references + rank are *mechanism* for representing disagreement
|
||||
and sourcing; which statement "wins" is left to the consumer/query.
|
||||
|
||||
### Divergences (boundaries / design notes)
|
||||
|
||||
- **Content is not Markdown.** A Wikibase "page" is a set of statements; there is no prose
|
||||
body. This is the **structure far-end**: shard-wiki must either (a) treat such a shard as
|
||||
a **structured/typed shard** projected to a *lossy* Markdown/table rendering (UC-55/UC-73),
|
||||
or (b) model a page whose payload is typed statements (T12). Forcing it into Markdown-first
|
||||
erases the graph — a design-bug if done silently; render-with-provenance instead.
|
||||
- **Provenance granularity is finer than ours.** Our provenance is per-page/per-shard;
|
||||
Wikibase is **per-statement** (references) and even per-value (rank). The page model and
|
||||
coordination journal should *allow* sub-page provenance (UC-75) even if MVP records it per
|
||||
page.
|
||||
- **Query is graph, not text/datalog.** SPARQL over RDF (with federated `SERVICE`) is a
|
||||
richer query far-end than Roam/Logseq datalog or Notion filters (UC-52) — and its
|
||||
`SERVICE` federation is a *query-time* cross-shard join, distinct from fedwiki structural
|
||||
federation. Note both as native-query tiers.
|
||||
|
||||
### What to keep
|
||||
|
||||
1. **Opaque stable identity, labels-as-annotations** as the identity model (T16).
|
||||
2. **Statement/assertion-level provenance** (references) and a **coexistence-with-rank**
|
||||
model as the structured form of union-without-erasure (UC-75).
|
||||
3. **Derived SPARQL/graph index over a canonical entity store** as a projection pattern
|
||||
(UC-63/UC-74), incl. **federated query** as a first-class federation mode.
|
||||
4. A **typed-graph page payload** option in the page model (T12), with **lossy
|
||||
render-to-Markdown** as the projection (never silent flattening).
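A lossy render that announces its own lossiness might look like the following sketch. The function name, the input dict keys, and the wording of the notice are all assumptions for illustration; the point is only that the projection states what it dropped instead of flattening silently.

```python
from typing import Dict, List


def render_markdown(entity_id: str, label: str, statements: List[Dict]) -> str:
    """Project an entity's statements to a Markdown table plus a lossiness notice."""
    lines = [
        f"# {label} ({entity_id})",
        "",
        "| Property | Value | Rank | Qualifiers | References |",
        "|---|---|---|---|---|",
    ]
    for st in statements:
        qualifiers = st.get("qualifiers", {})
        quals = ", ".join(f"{p}={v}" for p, v in qualifiers.items()) or "none"
        ref_count = len(st.get("references", []))  # reference contents are not rendered
        lines.append(f"| {st['property']} | {st['value']} | {st['rank']} | {quals} | {ref_count} |")
    lines += [
        "",
        f"> Lossy projection of {entity_id}: references are reduced to counts; "
        "the statement graph remains the canonical form.",
    ]
    return "\n".join(lines)
```

The trailing notice is a crude form of the provenance envelope: a reader of the rendered page can tell it is a view and where the canonical form lives.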
|
||||
|
||||
## 6. UC seeds
|
||||
|
||||
| # | Seed | Disposition |
|
||||
|---|------|-------------|
|
||||
| UC-73 | Attach a **Wikibase** as a **typed entity-statement (RDF) shard** (items/properties/statements w/ qualifiers); project to a rendered page view, lossy to Markdown, preserving the graph | **new** |
|
||||
| UC-74 | **Graph-query the union** via **SPARQL** and **federate queries across endpoints** (`SERVICE`) — graph query as a native-query tier + query-time cross-shard join | **new** |
|
||||
| UC-75 | Preserve **statement-level provenance** — references + rank attached to each assertion (sub-page provenance granularity) | **new** |
|
||||
| — | typed records → typed *graph* entities | enrich **UC-34** |
|
||||
| — | inter-record relations → typed graph edges with qualifiers | enrich **UC-58** |
|
||||
| — | native query → SPARQL/RDF + federated SERVICE | enrich **UC-52** |
|
||||
| — | provenance → statement/assertion granularity | enrich **UC-24** |
|
||||
|
||||
## 7. Architecture notes for SHARD-WP-0002
|
||||
|
||||
- **T12 (structured/typed page model):** add a **typed-graph payload** tier above
|
||||
typed-records — a page whose content is **entities + statements (claim + qualifiers +
|
||||
references + rank)**, with `somevalue`/`novalue` known-unknowns. Render-to-Markdown is a
|
||||
**lossy projection**, not the canonical form.
|
||||
- **T16 (identity / addressing):** adopt **opaque stable identity with labels-as-annotation**
|
||||
(Q/P model); record **statement GUIDs** as an example of *sub-page addressable units*.
|
||||
- **Native-query tiering:** SPARQL/RDF + federated `SERVICE` is the **graph far-end** of the
|
||||
query spectrum (above datalog/filters); `SERVICE` is also a **query-time federation**
|
||||
mode to sit beside fedwiki's structural federation.
|
||||
- **Provenance model:** allow **per-statement references + rank** (sub-page provenance,
|
||||
coexistence-with-curation) in the union, even if MVP collapses to per-page.
|
||||
- **Derived index:** WDQS = canonical JSON entities → update stream → Blazegraph SPARQL
|
||||
index; the reference implementation of UC-63 at scale (per-shard or core-built index, Q16).
|
||||
|
||||
## 8. Open questions
|
||||
|
||||
1. Does shard-wiki model a **typed-graph page** natively (T12), or always treat Wikibase as
|
||||
a structured shard **projected to a Markdown/table rendering** (UC-55), or both
|
||||
(canonical graph + lossy view)?
|
||||
2. Is **SPARQL/graph query** exposed as a union-level capability (translate to a common
|
||||
query layer) or only as a **pass-through** to graph-capable shards? How does federated
|
||||
`SERVICE` relate to shard-wiki's own cross-shard query?
|
||||
3. At what granularity does the coordination journal record **provenance** — per page
|
||||
(MVP), per statement (Wikibase-native), or configurable?
|
||||
4. Is **rank** (coexisting contradictory values w/ curation) representable in the union as a
|
||||
first-class "chorus of statements," unifying with fedwiki's page-level chorus (UC-72/27)?
|
||||
|
||||
## 9. Sources
|
||||
|
||||
- Wikibase/DataModel and **DataModel/Primer** — mediawiki.org
|
||||
- Help:Qualifiers; Wikidata SPARQL query service + Query Help; SPARQL tutorial — wikidata.org
|
||||
- Wikidata Query Service / User Manual — mediawiki.org; Wikitech (Blazegraph, updater)
|
||||
- "The wikibase model" — Vanderbilt Libraries Digital Lab (heardlibrary.github.io)
|
||||
- RaiseWikibase — Wikibase Data Model functions (ub-mannheim.github.io)
|
||||
- WShEx / EntitySchemas (ShEx) — arxiv.org/abs/2208.02697, ceur-ws.org Vol-3262
|
||||
|
||||
## 10. Traceability
|
||||
|
||||
New UCs **UC-73–UC-75** carry the marker **⬡** in the wikiengines column of
|
||||
`spec/UseCaseCatalog.md`. Enriched: UC-34, UC-58, UC-52, UC-24. Architecture cross-refs:
|
||||
SHARD-WP-0002 T12, T16, native-query tiering, provenance model, UC-63 derived index.
|
||||
@@ -22,9 +22,26 @@ when multiple files or sources are involved. Findings here inform `spec/` and
|
||||
| 2026-06-14 | `260614-roam-deep-dive/` | Roam Research — block-graph DataScript DB, transclusion, datalog, Roam Depot extension API; UC-50/51/52 |
|
||||
| 2026-06-14 | `260614-obsidian-deep-dive/` | Obsidian — file-over-app vaults, plugin API, ecosystem-popularity → UC signal; UC-53/54/55/56 |
|
||||
| 2026-06-14 | `260614-notion-deep-dive/` | Notion — closed block-DB SaaS, external REST API only, database-as-pages; UC-57/58/59 |
|
||||
| 2026-06-14 | `260614-shard-spectrum-synthesis/` | Synthesis — shard family matrix + eleven capability spectra across nine systems; feeds SHARD-WP-0002 T11–T16 |
|
||||
| 2026-06-14 | `260614-shard-spectrum-synthesis/` | Synthesis (v3) — shard family matrix + **fourteen** capability spectra across ~23 systems + **federation-model taxonomy** (fork+journal / VCS-replication+ping / query-join / engine-mirror); feeds SHARD-WP-0002 T1–T6 + T11–T16 |
|
||||
| 2026-06-14 | `260614-joplin-deep-dive/` | Joplin — SQLite-local/Markdown-on-sync, interchange-format attach, E2EE content opacity; UC-60/61 |
|
||||
| 2026-06-14 | `260614-logseq-deep-dive/` | Logseq — block-graph on plain Markdown files, in-file block IDs, derived Datalog index; UC-62/63 |
|
||||
| 2026-06-14 | `260614-localfirst-workspaces-deep-dive/` | Anytype · AFFiNE · AppFlowy — CRDT local-first workspaces (any-sync/Yjs/Yrs), native merge, P2P/E2EE; UC-64/65 |
|
||||
| 2026-06-14 | `260614-trilium-deep-dive/` | Trilium/TriliumNext — note cloning (DAG hierarchy), attribute inheritance/templates, HTML-native, scripting+ETAPI; UC-66/67 |
|
||||
| 2026-06-14 | `260614-wikijs-deep-dive/` | Wiki.js — storage-module engine (DB↔Git Markdown), GraphQL API, pluggable modules ≈ adapter-contract prior art; UC-68/69 |
|
||||
| 2026-06-14 | `260614-federated-wiki-deep-dive/` | Federated Wiki — fork-with-provenance, per-page semantic-action journal (story=replay), neighborhood/roster + chorus; prior art for our coordination journal / overlay / union pillars; UC-70/71/72 |
|
||||
| 2026-06-14 | `260614-wikibase-deep-dive/` | Wikibase/Wikidata — typed entity-statement knowledge graph (claim+qualifiers+refs+rank), SPARQL/RDF + federated SERVICE, opaque stable IDs, statement-level provenance; structure & query far-end; UC-73/74/75 |
|
||||
| 2026-06-14 | `260614-forge-wikis-deep-dive/` | Gitea · GitLab · GitHub wikis — a wiki is a separate `.wiki.git` of Markdown; git-clone universal, wiki API capability-varying (GitHub git-only); git IS the store (resolves UC-68 race); the home case; UC-76/77 |
|
||||
| 2026-06-14 | `260614-tiddlywiki-deep-dive/` | TiddlyWiki — entire wiki (content + app) in one self-contained HTML file = whole-file write-granularity extreme; Node `.tid` file-per-tiddler substrate (git-diffable); tiddler/field records, filter-expression query; UC-78 |
|
||||
| 2026-06-14 | `260614-ikiwiki-deep-dive/` | ikiwiki — wiki compiler: git-canonical Markdown source → static HTML (derived publish/projection); git-distributed clone federation + XML-RPC pinger (third federation flavor); UC-79 |
|
||||
| 2026-06-14 | `260614-quip-deep-dive/` | Salesforce Quip — closed-SaaS live docs + embedded spreadsheets/live apps; REST + lossy HTML import/export; Salesforce enterprise ACL; external-API payload-format facet + inline-object page model; UC-80 |
|
||||
| 2026-06-14 | `260614-mojomojo-deep-dive/` | MojoMojo — Perl Catalyst/DBIx::Class DB-backed wiki; pages + history in SQL tables, no file store/API → direct-DB-read binding; DB version rows as history source; UC-81 |
|
||||
| 2026-06-14 | `260614-oddmuse-deep-dive/` | Oddmuse — single Perl CGI, plain-text `page/` files + `keep/` revisions, no DB; the minimal file-store floor / graceful-degradation baseline; partial-history honesty; UC-82 |
|
||||
| 2026-06-14 | `260614-usemodwiki-deep-dive/` | UseModWiki — flat-file ancestor (Wikipedia's MediaWiki Phase I); CamelCase linking; lineage grounding for the minimal file-store floor; enrichment-only (reinforces UC-82, UC-25) |
|
||||
| 2026-06-14 | `260614-literate-programming-deep-dive/` | Literate programming (Knuth's WEB / weave / tangle) — one source → N co-equal derived projections (docs + code); named-chunk transclusion; splits replication- vs derivation-projection; SHARD-WP-0004 T1; UC-83 |
|
||||
| 2026-06-14 | `260614-jupyter-deep-dive/` | Jupyter Notebooks — `.ipynb` JSON cells + embedded computed outputs with fragile execution provenance; derived output stored *inside* the source; non-Markdown/lossy; kernel = capability; SHARD-WP-0004 T3; UC-84 |
|
||||
| 2026-06-14 | `260614-glamorous-toolkit-deep-dive/` | Glamorous Toolkit (moldable development on Pharo) — `gtView` open set of co-equal type-keyed computed views (none canonical) = moldable view registry; Lepiter live notebook over git files; SHARD-WP-0004 T7; enrichment-only (UC-47/48/54) |
|
||||
| 2026-06-14 | `260614-mathematica-deep-dive/` | Mathematica Notebooks — the original computational notebook (`.nb` = a Wolfram expression); nestable cell groups, structured re-evaluable outputs, `Manipulate` live widgets, CDF; confirms UC-84 notebook shape is a genus; SHARD-WP-0004 T2; enrichment-only (reinforces UC-84; UC-54/55) |
|
||||
| 2026-06-14 | `260614-squeak-pharo-deep-dive/` | Squeak & Pharo (image-based Smalltalk) — the live-object image (purest "live" end); image-is-not-a-store boundary (export→files only); Pharo Tonel/Iceberg externalizes code to git text; names the live↔snapshot projection axis; SHARD-WP-0004 **T6 + T8** (merged); boundary/enrichment-only, no new UC |
|
||||
| 2026-06-14 | `260614-processing-deep-dive/` | Processing / p5.js — program-as-page rendered live at view time (no cached output); adds materialization-timing + continuity facets to projection; execute-in-viewer = capability+trust; SHARD-WP-0004 T4; enrichment-only (UC-54/55) |
|
||||
| 2026-06-14 | `260614-strudel-deep-dive/` | Strudel.cc (TidalCycles JS) live-coding REPL — code as live time-based audio performance; the far live end (no faithful static form; static = source + marked recording); honesty test for graceful degradation; SHARD-WP-0004 T5; enrichment-only (UC-54/55) |
|
||||
| 2026-06-15 | `260614-computational-page-model-synthesis/` | **SHARD-WP-0004 synthesis** — *the computational page model*: source canonical / everything rendered is a projection; two axes (replication↔derivation, live↔snapshot); four computational page shapes; recommendation — executable content in scope as page-model+projection, out of scope as an execution platform (no INTENT amendment); feeds SHARD-WP-0002 T11–T16 |
|
||||
976
spec/CoreArchitectureBlueprint.md
Normal file
@@ -0,0 +1,976 @@
|
||||
# CoreArchitectureBlueprint — shard-wiki
|
||||
|
||||
Status: **draft for review** · Date: 2026-06-15 · Owner: tegwick
|
||||
|
||||
The whole-system architecture for shard-wiki, synthesised from `INTENT.md`, the 84-entry
|
||||
`UseCaseCatalog.md`, and the full research arc (`research/260608-*`, `research/260613-*`,
|
||||
`research/260614-*` — ~23 wiki/knowledge systems plus two cross-dive syntheses). This is the
|
||||
**core** blueprint: it defines the layers, the abstractions, and the load-bearing decisions
|
||||
that everything else implements.
|
||||
|
||||
Scope relationship to the other specs:
|
||||
|
||||
- **`ArchitectureBlueprint.md`** (existing) is the **authorization & history sub-blueprint**
|
||||
(the L0–L4 ladder). This document references it as the design of the cross-cutting
|
||||
authorization layer (§9) and does not restate it.
|
||||
- **`SHARD-WP-0002`** is the workplan that turns §6–§8 into
|
||||
`spec/FederationArchitecture.md` + the adapter-contract section of
|
||||
`spec/TechnicalSpecificationDocument.md`.
|
||||
- **`UseCaseCatalog.md`** is the demand this architecture must satisfy; UC references below
|
||||
are load tests, not decoration.
|
||||
- **`WikiEngineCoreArchitecture.md`** designs shard-wiki's native, headless, API-first wiki
|
||||
engine as a **canonical-mode shard backend** (one shard behind §6/§A — federation, union, and
|
||||
projection stay here in the orchestrator, not in the engine). Added per the 2026-06-15 INTENT
|
||||
amendment (decision `84ffdb48`, SHARD-WP-0013).
|
||||
|
||||
---
|
||||
|
||||
## 1. The thesis: *canonical vs derived* (three states)
|
||||
|
||||
Everything in shard-wiki follows from one organising decision — that state comes in exactly
|
||||
**three kinds**, and only one of them is disposable:
|
||||
|
||||
> **1. Sharded-canonical** — content owned by each shard (shard sovereignty).
|
||||
> **2. Coordination-canonical** — durable state *born inside shard-wiki* that encodes human
|
||||
> or cross-shard decisions and exists nowhere else: overlays (the local truth against a
|
||||
> read-only shard), curator equivalence bindings, alias tables, merge/reconciliation
|
||||
> decisions. It is recorded as an **append-only decision log in the Git coordination
|
||||
> journal** (event-sourced, §8.1); the *queryable current form* of that state (the effective
|
||||
> alias table, the equivalence set) is a **derived fold** of the log — i.e. tier 3, not tier 2.
|
||||
> What is canonical is the **log of decisions**, not any mutable snapshot of them.
|
||||
> **3. Derived-disposable** — everything shard-wiki *computes* from (1)+(2): the union graph,
|
||||
> equivalence index, query indexes, projections, views. It can be deleted and recomputed.
|
||||
>
|
||||
> **Canonical = sharded ∪ coordination. Derived = a pure function of canonical:**
|
||||
> `derived = f(sharded, coordination)`.
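The split between the canonical decision log (tier 2) and its queryable current form (tier 3) can be sketched as an event-sourced fold. The `Decision` record and its two event kinds are hypothetical; the blueprint does not define the journal's schema here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Decision:
    seq: int          # position in the append-only log
    kind: str         # "alias_set" or "alias_removed"
    alias: str
    target: str = ""


def fold_aliases(log: Iterable[Decision]) -> Dict[str, str]:
    """Derive the effective alias table by replaying decisions in log order."""
    table: Dict[str, str] = {}
    for decision in sorted(log, key=lambda d: d.seq):
        if decision.kind == "alias_set":
            table[decision.alias] = decision.target
        elif decision.kind == "alias_removed":
            table.pop(decision.alias, None)
    return table
```

The returned table can be thrown away at any time; replaying the same log yields the same table, which is what makes it derived and not canonical.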
|
||||
|
||||
This is the architectural form of "orchestrator, not engine." shard-wiki never *becomes* the
|
||||
source of truth; it composes sources and records the decisions it makes about them. The
|
||||
research earned the *files-canonical* half empirically — every serious system externalises its
|
||||
durable truth to files+VCS and treats the rest as derived: Logseq (DataScript index over plain
|
||||
files), ikiwiki (static HTML compiled from a git repo), Glamorous Toolkit / Lepiter (live
|
||||
views over git-versioned JSON), Pharo (Tonel/Iceberg code as git text), Jupyter teams
|
||||
(nbstripout — outputs are derived noise). The one tradition that refuses this — the Smalltalk
|
||||
**image** — is exactly the one we record as a *boundary, not a backend*
|
||||
(`research/260614-squeak-pharo-deep-dive`).
|
||||
|
||||
The earlier draft of this blueprint used a two-bucket framing ("canonical at the edges,
|
||||
derived in the middle"). That was wrong by omission: it had no home for **coordination-
|
||||
canonical** state, and so contradicted itself by listing curator bindings and alias tables as
|
||||
"derived/rebuildable" when a human binding manifestly cannot be rebuilt. The three-state model
|
||||
fixes that crack (review finding A-1) and makes `derived = f(canonical)` *literally* true.
|
||||
|
||||
Three consequences fall straight out, and they are the spine of the rest of this document:
|
||||
|
||||
1. **Graceful degradation is free.** If the derived tier is always recomputable, a backend
|
||||
that can only be read is still a first-class participant — you just derive less from it.
|
||||
2. **Provenance is tractable.** Because shard-wiki never claims to *be* the source, every
|
||||
derived artifact can always point back to the canonical input it came from (union without
|
||||
erasure is a structural property, not a feature bolted on).
|
||||
3. **The derived tier is a pure function of canonical state.** `derived = f(sharded,
|
||||
coordination)`. Bugs in the derived tier are recoverable by recompute; only the two
|
||||
canonical tiers must be durably protected — sharded by each shard, coordination by the
|
||||
Git journal (history). *Recomputability is a correctness property of the derived tier, not
|
||||
a promise that a from-scratch rebuild is operationally cheap — see §8.4 and the
|
||||
operational-envelope axis.*
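The second and third consequences can be illustrated together with a small pure function. This is a sketch with assumed input shapes (a dict of shards and an already-folded alias table), not the blueprint's actual union algorithm.

```python
from typing import Dict, List


def derive_union(
    sharded: Dict[str, Dict[str, str]],   # shard id -> {page name -> text}
    aliases: Dict[str, str],              # page name -> canonical page name
) -> Dict[str, List[dict]]:
    """Compute the union view; every entry points back to its canonical input."""
    union: Dict[str, List[dict]] = {}
    for shard_id in sorted(sharded):
        for name, text in sorted(sharded[shard_id].items()):
            key = aliases.get(name, name)
            union.setdefault(key, []).append(
                {"shard": shard_id, "source_name": name, "text": text}
            )
    return union
```

Because the function reads only canonical inputs and has no side effects, deleting its output and calling it again gives the same result, and versions from different shards sit side by side under one key without overwriting each other.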
|
||||
|
||||
### The dual narrow waist
|
||||
|
||||
Heterogeneity is mediated at exactly two interfaces, and nowhere else:
|
||||
|
||||
- **Bottom waist — the Shard Adapter Contract (§6).** Every backend, however weird, enters
|
||||
through one versioned, capability-described interface.
|
||||
- **Top waist — the Wiki Page Model (§7).** Every consumer, however demanding, sees one
|
||||
backend-neutral, Markdown-first-but-stretchable page model.
|
||||
|
||||
Between the waists, core logic is written **once** against capabilities and the page model —
|
||||
never against a specific backend. Adding TiddlyWiki or Notion or a git forge is writing an
|
||||
adapter and declaring a capability profile, not editing core algorithms.
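A minimal sketch of the bottom waist, assuming a much smaller capability profile than the real spectra: the three profile fields, the `ShardAdapter` protocol, and the operation names below are illustrative assumptions, not the contract itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class CapabilityProfile:
    can_write: bool
    has_history: bool
    native_query: str  # "none", "text", or "graph"


class ShardAdapter(Protocol):
    profile: CapabilityProfile

    def read_page(self, name: str) -> str: ...


class ReadOnlyFolderAdapter:
    """A deliberately limited backend: plain pages, no writes, no history."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.profile = CapabilityProfile(can_write=False, has_history=False, native_query="none")
        self._pages = dict(pages)

    def read_page(self, name: str) -> str:
        return self._pages[name]


def available_operations(adapter: ShardAdapter) -> List[str]:
    """Core logic reads only the profile; it never branches on the adapter class."""
    ops = ["read", "project", "cache"]  # assumed available for every attached shard
    if adapter.profile.can_write:
        ops.append("write-through")
    else:
        ops.append("overlay")  # edits are kept locally instead of mutating the shard
    if adapter.profile.has_history:
        ops.append("history")
    if adapter.profile.native_query != "none":
        ops.append("delegate-query")
    return ops
```

Supporting a new backend in this sketch means writing one more adapter class with its own profile; `available_operations` stays untouched.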
|
||||
|
||||
---
|
||||
|
||||
## 2. Architectural invariants
|
||||
|
||||
These are non-negotiable. Violating one is a design bug, not a tradeoff. They are INTENT's
|
||||
principles fused with the research through-lines.
|
||||
|
||||
| # | Invariant | Source |
|
||||
|---|-----------|--------|
|
||||
| I-1 | **Orchestrator, not engine.** Core composes shards; it never replaces or homogenises them. | INTENT Stability Note |
|
||||
| I-2 | **Three states; derived = f(canonical).** State is sharded-canonical, coordination-canonical (journal), or derived-disposable. The derived tier (union/index/projection) is a pure, recomputable function of the two canonical tiers; only canonical state is durably protected. | §1; Logseq/ikiwiki/GT through-line |
|
||||
| I-3 | **Capability-awareness is data.** A binding's abilities are a *profile* (positions on spectra), read by generic core logic — not per-backend branches. | synthesis v3 §2; INTENT capability-aware adapters |
|
||||
| I-4 | **Union without erasure.** Every page/revision/projection/overlay/view carries its provenance, freshness, liveness, and divergence. | INTENT; provenance-granularity spectrum (Wikibase) |
|
||||
| I-5 | **Overlay before mutation.** Writes to anything below write-through land as drafts/patches/MRs first; no silent remote mutation. | INTENT |
|
||||
| I-6 | **Git-addressable coordination.** Every information space has a Git-backed journal even when its shards are not git-native. | INTENT |
|
||||
| I-7 | **Mechanism over policy.** Canonical-source, conflict, editorial, sync cadence are configurable presets, never hard-coded. | INTENT |
|
||||
| I-8 | **Graceful degradation.** A limited backend is still usable as read-only / cache / projection / backup / patch target. | INTENT |
|
||||
| I-9 | **Identity ≠ placement.** A page is an entity that may occupy N locations; address by identity, not by path. | Trilium note/branch; ZigZag |
|
||||
| I-10 | **History is the floor.** Every write is a recoverable commit; recoverability, not gatekeeping, is the baseline protection. | ArchitectureBlueprint §2 |
|
||||
| I-11 | **Authorization in core, authentication delegated.** Core decides who-may; an external provider says who-is. | INTENT; ArchitectureBlueprint |
|
||||
| I-12 | **Not a file-sync daemon; not an execution platform.** Sync is wiki-page-semantic; computation is recognised+projected, not hosted. | INTENT; computational-page-model synthesis |
|
||||
| I-13 | **Tenant-partitioned derived state.** Derived state is partitioned by tenant/root entity; no derived artifact spans tenants except via explicit, authorised cross-root federation. | §9.1; review B-3 |
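Invariant I-5 can be sketched as a small routing rule. The `write_mode` values, the `Journal` shape, and the `write_page` function are hypothetical simplifications; in particular, the sketch does not model the journaling that I-10 would also require for write-through commits.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Shard:
    name: str
    write_mode: str  # "write-through", "patch", or "read-only"
    pages: Dict[str, str] = field(default_factory=dict)


@dataclass
class Journal:
    overlays: List[dict] = field(default_factory=list)


def write_page(shard: Shard, journal: Journal, page: str, text: str) -> str:
    """Mutate the shard only when it is write-through; otherwise record an overlay."""
    if shard.write_mode == "write-through":
        shard.pages[page] = text
        return "written"
    # Anything below write-through: keep the edit as coordination-canonical state.
    journal.overlays.append({"shard": shard.name, "page": page, "text": text})
    return "overlaid"
```

A read-only shard keeps its own content untouched while the overlay preserves the edit, so nothing is mutated remotely without an explicit later step.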
|
||||
|
||||
---

## 3. The layered architecture

```
        ┌───────────────────────────────────────────────────────────────┐
        │ L6 Consumers — Orchestrator API · CLI/agents · Web/Obsidian   │
        ├───────────────────────────────────────────────────────────────┤
X-cut   │ L5 Authorization (PEP/PDP, identity-provider iface) →         │ X-cut
Prove-  │    see ArchitectureBlueprint.md (L0–L4 ladder)                │ Capa-
nance   ├───────────────────────────────────────────────────────────────┤ bility
  ▲     │ L4 Union & Projection (DERIVED · rebuild=fallback)            │   ▲
  │     │    identity resolution · equivalence/chorus · union graph ·   │   │
  │     │    replication+derivation projections · moldable view registry│   │
  │     │    · derived query index                                      │   │
  │     ├───────────────────────────────────────────────────────────────┤   │
  │     │ L3 Coordination (Git journal · overlay/patch engine ·         │   │
  │     │    federation-model strategies · reconciliation)              │   │
  │     ├───────────────────────────────────────────────────────────────┤   │
  │     │ L2 Wiki Page Model ── TOP WAIST ──                            │   │
  │     │    backend-neutral pages · identity≠placement · span address ·│   │
  │     │    provenance envelope · the page shapes                      │   │
  │     ├───────────────────────────────────────────────────────────────┤   │
  │     │ L1 Shard Adapter Contract ── BOTTOM WAIST ──                  │   │
  │     │    versioned iface · capability profile (orthogonal) ·        │   │
  │     │    attachment-mode binding · operation verbs                  │   │
  └──── ├───────────────────────────────────────────────────────────────┤ ──┘
        │ L0 Backends (not ours): git repos, wiki/ subdirs, Gitea/      │
        │    GitLab/GitHub wikis, folders, Obsidian, WebDAV, Notion,    │
        │    Coulomb spaces, notebooks, …                               │
        └───────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
**Provenance** and **Capability** are drawn as vertical rails because they are not layers —
they are present at every layer. A page object at L2 carries provenance; a projection at L4
carries provenance; an authz decision at L5 records the context under which content was read.
Likewise a capability profile declared at L1 is consulted at L3 (can we write-through?), L4
(can we delegate a query?), and L5 (can this principal even reach the op?).

The dependency rule is strict and downward, and it tracks the **three states (§1)**, not the
layer numbers: **the derived-disposable tier (the whole of L4) may be deleted and recomputed
from canonical state (sharded content at L1 + coordination-canonical state in the L3 journal).**
Nothing canonical may depend on derived state. Note the journal at L3 is *canonical* (it holds
overlays, bindings, aliases, merges); only L4 is disposable.
|
||||
|
||||
---

## 4. Core abstractions (the vocabulary code must use)

Straight from INTENT, sharpened by research. New code maps onto these; it does not invent
parallel terms.

- **Shard** — an independently meaningful page store attached to a root entity, with
  *sovereignty*: its own backend, capability profile, history, identity model, limits.
- **Root entity / information space** — the joined space shards attach to; the unit of
  Git coordination and of multi-tenancy (a tenant maps to a root entity, ArchitectureBlueprint).
- **Shard adapter contract** — the versioned L1 interface; the bottom waist.
- **Capability profile** — a shard binding's position on each of the 15 spectra (§6) plus its
  supported verbs. *The* data structure that drives degradation.
- **Wiki page model** — the L2 backend-neutral page; the top waist.
- **Page identity vs placement vs equivalence** — a page is an entity with a *stable handle*
  (identity); it may have N placements (paths/shards); **addressing and transclusion key on
  identity, but equivalence keys on content fingerprint *across* identities** (§7.2, I-9). The
  three are distinct mechanisms — never conflate identity with a fingerprint.
- **Provenance envelope** — the metadata each artifact carries (source shard, freshness,
  liveness, authz context, overlay status, divergence, lineage), stored **layered**: a
  page-level envelope + span-level *deltas*, so per-span cost is near-zero when uniform (§7.3).
- **Coordination journal** — the L3 Git-backed, **append-only decision log** for a space: the
  durable home of all **coordination-canonical** state (§1, §8.1) as *events* (overlay-created,
  binding-made, alias-set, merge-decided), plus the content change-flow record. It is
  event-sourced — committed, never overwritten; the queryable current coordination state is a
  derived fold of it (§8.1).
- **Overlay** — a non-destructive local edit against a remote/read-only/limited shard,
  representable as draft/patch/commit/MR before destructive apply. Coordination-canonical: an
  unapplied overlay is the local truth and lives in the journal.
- **Projection** — a derived view of shard content. The default is a **plain lazy
  replication-projection** (a freshness-stamped cache); only *source* content needing
  transform/evaluate uses the **derivation-projection** extension point with its two-axis
  typing (kind × liveness) and the moldable view registry (§8.4–§8.5).
- **Federation model** — the selected coordination strategy for a space (§8.3 taxonomy, T17).
- **Shard mode** — read-only · write-through · mirrored · projected · cached · canonical
  (a *policy* selection constrained by the capability profile).
|
||||
|
||||
---

## 5. Why "layered" and not "pipeline" or "plugin-bus"

Two rejected alternatives, recorded so the choice is legible:

- **A sync pipeline** (source → transform → sink) was rejected: it implies a privileged
  direction and a canonical sink, which violates shard sovereignty (I-1) and
  union-without-erasure (I-4). shard-wiki is a *star* (many shards ↔ one space), not a pipe.
- **A flat plugin bus** (every backend a peer plugin emitting events) was rejected as the
  *top-level* shape: it has no narrow waist, so heterogeneity leaks into every consumer.
  We keep the plugin idea but confine it to L1 (adapters) and L3 (federation strategies),
  behind the waists.

The layered-with-rails shape is what makes I-2/I-3/I-4 hold simultaneously.
|
||||
|
||||
---

## 6. Bottom waist — the Shard Adapter Contract (L1)

The single most important design decision in the project: **the adapter contract models
positions on capability spectra, not a flat checklist of boolean verbs.** A backend is not
"can/can't merge"; it sits *somewhere* on the merge spectrum, and federation operations
degrade by position. This is the lesson of putting ~23 systems in one matrix
(`research/260614-shard-spectrum-synthesis`, v3).

### 6.1 The fifteen capability spectra

Each binding declares a position on each axis. Core algorithms read these positions; there is
no per-backend code in core (I-3).

1. **Addressing granularity** — none → path → page-level store-id → in-file span → in-file
   block id (Logseq `id::`) → store-UUID → portable tumbler (Xanadu, the unreached ideal)
2. **Content identity** — none → path/title → fingerprint → span-set
3. **Identity vs placement** — path=identity → identity separated from placement (Trilium
   note/branch = a DAG)
4. **Structure** — flat MD → frontmatter/`key::` → `%META%` → typed objects → DB schema+
   relations → object-graph/ontology → computed (inherited+templated) → typed-graph statements
5. **History** — none → internal-only / CRDT-log → open-file → git-native
6. **Merge model** — none → git/text → conflict-notes/keep-both → native-CRDT → coexist-with-rank
7. **Native query** — none → text → build-your-own derived index → datalog/graph → DB query → SPARQL
8. **Translation** — native → lossless → lossy-with-fidelity-report (incl. HTML)
9. **Attachment mode** — file-store (native | interchange-mirror) → git-IS-store → in-engine-host
   → local-REST → external-API → direct-DB → CRDT-replica → P2P/no-central-endpoint
10. **Operational envelope** — local/unbounded → realtime CRDT/WebSocket → rate-limited/
    eventually-consistent/paginated
11. **Access grant** — open → token → OAuth scoped+revocable → P2P key/invite → enterprise ACL
12. **Content opacity** — plaintext → structured re-evaluable value → encrypted whole-shard →
    per-item → proprietary-lossy-exportable
13. **Write granularity** — whole-file (TiddlyWiki) → per-page → section/anchor → per-block → story-item
14. **Provenance granularity** — per-shard → per-page → per-edit → per-statement/value (Wikibase rank+refs)
15. **Computational / liveness** — static source → captured-output snapshot → live-over-files →
    view-time render → irreducibly-live/temporal
|
||||
|
||||
### 6.2 Operation verbs

`read, write, diff, merge, lock, version, publish, notify, transclude-source,
translate-syntax, structured-payload, derive-projection, execute/evaluate`. The last two are
**gated, off by default** (§8, computational content). Verb support is part of the profile and
must reconcile with the federation-ops capability matrix (SHARD-WP-0002 T10).

### 6.3 Attachment-mode taxonomy (axis 9, expanded)

A backend may offer **several** modes; attach mode is a **per-binding, capability-gated
choice**, with one declared authoritative. Modes: file-store (native vault/folder *or* an
interchange/sync mirror), **git-IS-store** (the home case — forge wikis & ikiwiki: git is the
store *and* the journal at once, resolving the engine-mirror write-race), in-engine hosted
adapter (XWiki component, Obsidian/Logseq/Roam plugin, Trilium script), local-REST (Joplin
Data API, Trilium ETAPI), external-API-only (Notion), direct-DB (MojoMojo schema→model),
CRDT-replica (Anytype/AFFiNE/AppFlowy), P2P/no-central-endpoint. **Boundary:** a monolithic
live-memory blob (Smalltalk image, a kernel) is **never** an attach target — it participates
only via export→files (I-12).

### 6.4 Contract rules

- **Versioned interface** (Foswiki::Store + Foswiki::Meta is the proof that a stable
  store-interface-with-swappable-backends works). Capability discovery is a static profile
  with optional runtime negotiation.
- **Backend-swap tolerance** — shard identity/provenance survives a substrate change
  (RCS↔PlainFile, folder→Git, Logseq file→SQLite): bind to *capabilities*, not to "it's files."
- **Absence is first-class** — the profile must express *can't* cleanly (Oddmuse floor), so
  degradation paths are explicit, never guessed.
|
||||
|
||||
### 6.5 Orthogonal core, implied positions, and the interaction subset

Fifteen independent ordinal axes is *descriptively* right but would be *operationally* a mess
if treated as fifteen free dimensions: the axes are **not orthogonal**, and a degradation
function over all 15 jointly is the flat-checklist problem returning in higher dimensions
(review D-1). Three rules tame it.

**(a) A small orthogonal core; the rest are implied.** Most axes are *correlated* and collapse
to a few independent choices. The **core axes** an adapter must independently declare:

1. **Substrate** → drives attachment-mode, history, merge, and native-query positions together
   (git-IS-store ⟹ history=git-native ⟹ merge=git/text ⟹ query=build-your-own-index;
   relational-DB ⟹ direct-DB attach ⟹ DB-version-row history ⟹ DB query).
2. **Write granularity** → drives addressing granularity and the overlay/patch shape.
3. **Content opacity** → drives translation and (where encrypted) collapses native-query.
4. **Operational envelope** → drives freshness mode (§8.8) and rebuild expectations (§8.7).
5. **Access grant** → independent (authz, L5).
6. **Computational/liveness** → independent (projection kind, §8.5).

The remaining axes are **implied/derived** from these via published implication rules; an
adapter *may* override an implied position, but the default is computed, not hand-set. This
turns ~15 free dimensions into ~6 independent ones plus derivations — fewer things to get
wrong, and impossible combinations become unrepresentable.

**(b) Implication rules forbid impossible profiles.** E.g. `attachment=git-IS-store ⟹
history≥git-native`; `opacity=encrypted-whole-shard ⟹ native-query=none ∧ translation≤opaque`;
`merge=native-CRDT ⟹ history=CRDT-log ∧ envelope=realtime`. A profile that violates an
implication is rejected at registration — capability-as-data (I-3) with integrity constraints.
|
||||
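A minimal sketch of rule (b), assuming a profile is a small record and each implication rule is a premise/conclusion pair checked at registration. The `Profile` fields, the `RULES` encoding, and the `register` function are hypothetical names for illustration, not part of the contract; only two of the three example rules are encoded.

```python
from dataclasses import dataclass
from enum import IntEnum


class History(IntEnum):
    """Ordinal positions on the history spectrum (axis 5), lowest first."""
    NONE = 0
    INTERNAL_OR_CRDT_LOG = 1
    OPEN_FILE = 2
    GIT_NATIVE = 3


@dataclass(frozen=True)
class Profile:
    # Hypothetical subset of a capability profile; a real one covers all axes.
    attachment: str
    history: History
    opacity: str
    native_query: str


# Each rule is (name, premise, conclusion): if the premise holds, so must the conclusion.
RULES = [
    ("git-IS-store => history>=git-native",
     lambda p: p.attachment == "git-IS-store",
     lambda p: p.history >= History.GIT_NATIVE),
    ("encrypted-whole-shard => native-query=none",
     lambda p: p.opacity == "encrypted-whole-shard",
     lambda p: p.native_query == "none"),
]


def violations(profile):
    """Names of every implication rule the profile breaks."""
    return [name for name, premise, conclusion in RULES
            if premise(profile) and not conclusion(profile)]


def register(profile):
    """Reject an impossible profile at registration instead of running with it."""
    broken = violations(profile)
    if broken:
        raise ValueError(f"profile rejected: {broken}")
    return profile
```

Using an ordered enum for the axis is what lets a rule say `>=` rather than enumerate positions; the string-valued fields are a shortcut for brevity.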
|
||||
**(c) The degradation function reads a *named, small* interaction subset — not all pairs.**
"No per-backend code" is only credible if we say *which* axis interactions the generic logic
actually consults. They are:

| Operation | Axes consulted (jointly) |
|-----------|--------------------------|
| **write / overlay-apply** | write-granularity × merge-model × history × access-grant |
| **transclude / address a span** | addressing-granularity × write-granularity × identity-vs-placement |
| **project / cache** | operational-envelope × computational-liveness × content-opacity |
| **query** | native-query × content-opacity (encrypted ⇒ derive-index-or-none) |
| **translate** | translation × content-opacity × structure |
| **federate** | substrate × history × merge (per the §8.3 model) |

Everything else is a single-axis check. This table *is* the degradation contract: it is small,
enumerated, and testable — the proof obligation behind "core logic written once."
|
||||
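The table above can be carried as plain data, which is one way to keep it "enumerated and testable". The sketch below assumes a profile is a flat mapping from axis name to declared position; `AXES_CONSULTED`, `consulted_positions`, and the operation keys are illustrative names, not identifiers from the contract.

```python
# Operation -> the axes its degradation logic may read jointly (mirrors the table).
AXES_CONSULTED = {
    "write": ("write-granularity", "merge-model", "history", "access-grant"),
    "transclude": ("addressing-granularity", "write-granularity", "identity-vs-placement"),
    "project": ("operational-envelope", "computational-liveness", "content-opacity"),
    "query": ("native-query", "content-opacity"),
    "translate": ("translation", "content-opacity", "structure"),
    "federate": ("substrate", "history", "merge-model"),
}


def consulted_positions(operation, profile):
    """Return only the axis positions this operation is allowed to look at."""
    axes = AXES_CONSULTED[operation]
    missing = [axis for axis in axes if axis not in profile]
    if missing:
        raise KeyError(f"profile lacks axes needed by {operation!r}: {missing}")
    return {axis: profile[axis] for axis in axes}
```

Handing the degradation logic only this restricted view makes an accidental dependency on an unlisted axis fail loudly in tests rather than creep in silently.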
|
||||
### 6.6 Conformance — profiles are verified, never self-asserted

Capability-as-data (I-3) and the entire degradation contract (§6.5) rest on one assumption:
**the profile tells the truth.** If an adapter declares `merge=git/text` but corrupts merges,
or claims `notify` and never emits, it silently poisons every degradation decision in core —
the failure is invisible because core *believed the data* (review B-2). So the profile is not
taken on trust:

- **The contract ships a versioned conformance suite.** A published battery that, given a live
  binding, **exercises each declared verb and each declared spectrum position and checks that
  observed behaviour matches the claim** (a `write` round-trips; a `diff` is real; `notify`
  actually fires; an "encrypted/opaque" shard genuinely refuses plaintext query; an
  implication-rule position, §6.5(b), holds). The suite is versioned *with* the contract, so an
  adapter proves conformance against a known contract version.
- **Passing conformance is an admissibility precondition.** A binding that fails (declares a
  capability it does not honour) is **rejected at registration**, not run in production with a
  lying profile. Capability discovery (§6.4) therefore yields a *verified* profile.
- **Self-reported, then verified.** Adapters still *declare* their profile (discovery stays
  cheap); conformance *verifies* the declaration. The two together are what make I-3 and §6.5
  sound rather than aspirational — degradation logic acts on verified data.
- **Mismatch is data, not a crash.** A conformance gap is reported as a precise
  capability-by-capability diff (what was claimed vs observed), so an adapter author fixes the
  profile or the code; degraded-but-honest registration (drop the unsupported claim) is allowed.

This is the same discipline a versioned store interface needs in general (the `Foswiki::Store`
lineage that inspired the contract): a backend may only participate behind the interface if it
*demonstrably* behaves as the interface says.
|
||||
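To make "mismatch is data, not a crash" concrete, here is a small sketch of a claimed-vs-observed diff and of degraded-but-honest registration. It assumes both the declared profile and the suite's observations are flat dictionaries keyed by capability name; `conformance_diff` and `degrade_to_honest` are hypothetical helpers, not the suite's actual interface.

```python
def conformance_diff(claimed, observed):
    """Map each mismatching capability to (claimed, observed); empty means conformant."""
    return {
        capability: (value, observed.get(capability))
        for capability, value in claimed.items()
        if observed.get(capability) != value
    }


def degrade_to_honest(claimed, observed):
    """Drop every claim the suite could not confirm, keeping the verified rest."""
    gaps = conformance_diff(claimed, observed)
    return {cap: value for cap, value in claimed.items() if cap not in gaps}


claimed = {"write": True, "notify": True, "merge": "git/text"}
observed = {"write": True, "notify": False, "merge": "git/text"}

print(conformance_diff(claimed, observed))
print(degrade_to_honest(claimed, observed))
```

The diff is the report an adapter author would act on; the degraded profile is what registration could accept if policy allows dropping the unsupported claim.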
|
||||
---

## 7. Top waist — the Wiki Page Model (L2)

Backend-neutral, **Markdown-first but stretchable many ways at once**. The page model is the
lingua franca every consumer sees; an adapter's job is to project its backend into this model
(read) and accept overlays back (write), within its capabilities.

### 7.1 Page shapes the model must carry

- **Prose Markdown** — the baseline.
- **Typed / computed records** — frontmatter/`%META%`/XObjects/Notion DB rows; **computed
  metadata** (Trilium inherited+templated) represented as *effective-vs-own with per-attribute
  provenance*.
- **Typed-graph statements** — Wikibase claim + qualifiers + references + rank (structure
  far-end).
- **Inline-embedded objects** — Quip/Notion spreadsheets & live apps inside prose.
- **Non-Markdown assets** — drawings, canvases, images: typed asset / opaque blob / pluggable
  content-type registry, never silent-flattened.
- **The four computational shapes** (§8): one-source-many-projections, notebook (embedded
  computed output), program-as-page, live/temporal.

All shapes reduce to a common skeleton: **`(content | source, structure, provenance envelope,
[derivation rule])`**. The page model stores the richest faithful form as canonical and treats
any Markdown rendering of a non-Markdown shape as a *lossy projection* (I-4 + fidelity report).
|
||||
|
||||
### 7.2 Identity, placement, addressing — three distinct concepts

The earlier draft used "identity" for two different things and (worse) suggested deriving page
identity from a content fingerprint — which would make *editing a page change its identity* and
break every reference to it (review bug B-1). They are pulled apart here:

- **Page identity — a *stable handle*.** A shard-scoped, durable key that **survives edits**:
  the backend's native page/note id where one exists (Roam/Notion/Trilium uid, a git path
  treated as a name, a wiki page name), wrapped in a shard scope so it survives projection and
  never collides across shards. Identity is *assigned/minted, not computed from content*.
  References, placement, transclusion targets, and overlays all key on identity.
- **Placement — *where* an identity sits.** One identity → N placements (paths/shards) = a DAG;
  no single canonical path (I-9). Placement can change without changing identity.
- **Content equivalence — *detecting sameness*, never identity.** A **content fingerprint** (or
  span-set overlap) identifies a *version / a piece of content*, used to detect that two
  *distinct identities* hold the same or derived content (the equivalence/chorus mechanism,
  §8.4). A fingerprint is never a page's identity: same page, edited → new fingerprint, **same
  identity**; two pages, identical content → same fingerprint, **different identities**.
- **Span addressing** — a sub-page address within an identity: adopt native span IDs where
  minted (Roam `:block/uid`, Logseq `id::`, Notion/CRDT UUID); else a *position* address
  (path+range) or a *content-fingerprint* address for equivalence/transclusion. The Xanadu
  tumbler is the portable ideal the scheme aims at without requiring.
- **Provenance envelope** rides on pages and spans (see §7.3 for its layered, low-cost form).

So the chain is: **identity (stable) → placements (N, mutable) → equivalence (cross-identity
sameness, fingerprint-based)** — three concepts, three mechanisms, never conflated.
|
||||
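The separation can be illustrated with a short sketch. It assumes identity is a shard-scoped pair and that a SHA-256 over the text stands in for the content fingerprint; the source does not specify a fingerprint algorithm, and `PageId`, `fingerprint`, and `equivalent_groups` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PageId:
    """Stable handle: the backend's native id wrapped in a shard scope (minted, not computed)."""
    shard: str
    native_id: str


def fingerprint(content: str) -> str:
    """Identifies a piece of content, never a page (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def equivalent_groups(pages):
    """Group distinct identities whose current content has the same fingerprint."""
    groups = {}
    for page_id, content in pages.items():
        groups.setdefault(fingerprint(content), []).append(page_id)
    return [ids for ids in groups.values() if len(ids) > 1]


a = PageId("notes", "p1")
b = PageId("mirror", "x9")
pages = {a: "Hello", b: "Hello"}

print(equivalent_groups(pages))   # two identities, one fingerprint
pages[a] = "Hello, edited"        # the edit changes the fingerprint, not the key
print(equivalent_groups(pages))
```

References keyed on `PageId` keep resolving after the edit, while the equivalence grouping, being derived from content, simply changes.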
|
||||
### 7.3 Provenance is layered, not per-span-duplicated

A provenance envelope on *every span* (source shard, freshness, liveness, overlay status,
authz context, divergence, lineage) would, at block granularity, mean ~10k near-identical
envelopes for a 10k-block page — provenance dwarfing content (review D-2). The fix is the exact
pattern the page model already uses for Trilium's computed metadata: **effective-vs-own**.

- **Page-level envelope** holds the values that are uniform across the page (almost always:
  source shard, observed-at, liveness, authz context).
- **Span-level deltas** record *only where a span differs* from its page envelope — a
  transcluded span from another shard, an overlaid span, a span that diverges. A span with no
  delta inherits the page envelope at zero storage cost.
- **Effective provenance** for any span = page envelope ⊕ span delta, computed on read.

Per-span cost is therefore **near-zero in the common (uniform) case** and pays only for genuine
heterogeneity — the same "carry only the difference" principle, applied to shard-wiki's own
metadata. Provenance remains complete (I-4); it is just not redundantly materialised.
|
||||
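A minimal sketch of "page envelope ⊕ span delta", assuming envelopes are flat dictionaries and ⊕ means "delta fields override page fields". The field names and the `effective_provenance` helper are illustrative only.

```python
def effective_provenance(page_envelope, span_delta=None):
    """Page envelope overlaid with the span's delta; no delta means pure inheritance."""
    return {**page_envelope, **(span_delta or {})}


page_envelope = {
    "source_shard": "notes",
    "observed_at": "2026-01-01T00:00:00Z",
    "liveness": "static",
    "overlay_status": "none",
}

# Sparse storage: only spans that differ from the page carry anything at all.
span_deltas = {
    "blk-42": {"source_shard": "handbook"},   # a span transcluded from another shard
}


def provenance_for(span_id):
    """Compute effective provenance on read; nothing per-span is materialised."""
    return effective_provenance(page_envelope, span_deltas.get(span_id))


print(provenance_for("blk-1"))    # inherits the page envelope unchanged
print(provenance_for("blk-42"))   # same envelope, source_shard overridden
```

For a 10k-block page with one transcluded block, storage is one envelope plus one single-field delta rather than 10k envelopes.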
|
||||
---

## 8. Coordination, federation & projection

### 8.1 Coordination journal (L3) — Git as the spine

Every information space has a Git-backed coordination journal (I-6). It records cross-shard
operations (fork, import, reconcile, overlay-apply, space-branch) and **is** the history floor
(I-10). For git-IS-store shards the shard's own git log *is* this journal; for non-git shards
the journal supplements (begins-now / mirrors-forward / snapshots-replica) or imports
(backfill open file history). History portability is a spectrum, handled per profile (axis 5).

**The journal is an append-only decision log; current coordination state is a derived fold
(review B-3).** The first draft said coordination-canonical state "lives in the journal"
without saying how Git — excellent for history, poor for mutable structured state — represents
an alias table or an equivalence graph. Resolution: **event sourcing.** The journal stores
*decisions as events* (`overlay-created`, `binding-made`, `alias-set`, `merge-decided`,
`page-forked`), append-only and git-addressable (so history/patch/review/backup over
coordination state come for free — I-6 is *strengthened*, not bypassed). The **queryable
current state** (the effective alias table, the live equivalence set) is a **derived fold** of
the log — tier-3 disposable, indexed like any other derived structure (§8.7), rebuilt by
replaying the log. So "all equivalences touching X" is an index lookup, not an O(scan) of Git.
This is the clean form of the §1 three-state model: **the log is canonical; its folded current
state is derived.**
|
||||
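The fold can be sketched in a few lines. The example assumes events are dictionaries with a `type` field and that a later `alias-set` for the same alias supersedes an earlier one; the event payload fields (`alias`, `target`, `overlay`) are hypothetical, since the source names the event types but not their shape.

```python
def fold_aliases(events):
    """Replay the decision log in order to rebuild the effective alias table."""
    aliases = {}
    for event in events:
        if event["type"] == "alias-set":
            aliases[event["alias"]] = event["target"]
        # Other event types feed other folds (overlays, bindings, merges).
    return aliases


# The canonical, append-only log: decisions are added, never edited in place.
log = [
    {"type": "alias-set", "alias": "Home", "target": "shard-a:page-1"},
    {"type": "overlay-created", "overlay": "ov-7"},
    {"type": "alias-set", "alias": "Home", "target": "shard-b:page-9"},
]

current = fold_aliases(log)        # derived and disposable
as_of_first = fold_aliases(log[:1])  # any earlier state is a replay of a prefix

print(current)
print(as_of_first)
```

Deleting `current` loses nothing: replaying the same log yields the same table, which is what makes the folded state tier-3.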
|
||||
**Concurrency: who may append (review B-1).** A multi-tenant L4 deployment runs several
orchestrator instances, so "the journal is local Git, single writer" is not given. The model:

- **One *append authority* per information space.** Appends to a space's log are serialized
  through a single logical writer (a per-space lease/leader; instances without the lease forward
  their append intents to it). This makes the log a **totally-ordered event sequence** per space
  — the ordering authority §8.6 relies on — without a distributed transaction. Spaces are
  independent, so this scales horizontally *across* spaces (the unit of partition is the space /
  root entity, matching the tenant partition, I-13); it is a per-space serialization point, not
  a global one.
- **Git is the durable, addressable form; appends are commits** (or fast objects batched into
  commits) under the lease — no concurrent-writer merge races because there is one writer at a
  time per space.
- **Read-your-writes** holds within a space because every reader resolves current state from
  the same ordered log (or its fold); across spaces there is no shared state to be inconsistent.
- **HA / failover:** the lease is time-bounded and re-grantable; a failed append-authority is
  replaced and resumes from the log's head (the log is the recovery point). A partition that
  splits the authority degrades that *space* to read-only until a single writer is re-elected —
  it never forks the log (availability yields to log integrity; an explicit, stated trade).
- **Open residual (→ §12, O-3-adjacent):** whether very high append rates need per-space log
  *sharding* (sub-logs merged by a deterministic order) is an implementation spike, not an
  architectural change.
|
||||
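As a rough illustration of the per-space append authority, the sketch below models a time-bounded lease in memory, with the clock passed in explicitly so the behaviour is deterministic. It is a single-process stand-in under strong simplifying assumptions: a real deployment would need a shared lease store and protection against a stale holder, and `SpaceLog`, `acquire`, and `append` are hypothetical names.

```python
class SpaceLog:
    """In-memory stand-in for one space's journal plus its append lease."""

    def __init__(self):
        self.events = []
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, instance, now, ttl):
        """Grant or renew the lease if it is free, already ours, or expired."""
        if self.holder in (None, instance) or now >= self.expires_at:
            self.holder = instance
            self.expires_at = now + ttl
            return True
        return False

    def append(self, instance, event, now):
        """Append under a live lease; return the event's position in the total order."""
        if self.holder != instance or now >= self.expires_at:
            raise PermissionError(f"{instance} does not hold the append lease")
        self.events.append(event)
        return len(self.events) - 1


journal = SpaceLog()
journal.acquire("orch-a", now=0.0, ttl=10.0)
journal.append("orch-a", "alias-set", now=1.0)

# After the lease lapses, another instance takes over and resumes from the head.
journal.acquire("orch-b", now=11.0, ttl=10.0)
journal.append("orch-b", "merge-decided", now=12.0)
print(journal.events)
```

An instance without the lease gets an error rather than a forked log, which mirrors the stated trade: availability yields to log integrity.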
|
||||
**History must stay recoverable *and* bounded (review C-3).** "Every write is a commit" + open
L0 means an unbounded, bot-/vandalism-amplified journal that eventually degrades Git itself.
Recoverability (I-10) is non-negotiable, so the answer is *compaction, not deletion*:

- **Routine git maintenance** — background `gc`/repack, commit-graph, and (for very large
  spaces) partial-clone / sparse strategies; operational, no semantic change.
- **Squash-compaction of low-value churn (policy, §10)** — long runs of rapid same-author
  edits or revert-pairs can be folded into checkpoint commits *while preserving the recoverable
  endpoints*; what is squashed is configurable and always leaves the content recoverable (it
  compacts the *path*, not the *reachable states*).
- **Per-shard history offload** — a git-IS-store shard keeps its own history in its own repo;
  the coordination journal references it rather than duplicating it (the journal records
  *coordination* events, not a second copy of every shard commit).
- **Anti-abuse hooks (policy)** — rate-limiting / quarantine for anonymous L0 writers feed the
  authz/policy layer; they throttle *abuse*, never legitimate history. Recoverability is the
  floor; bounding is how it survives at scale.
|
||||
|
||||
### 8.2 Overlay / patch engine (L3)

The default write path for anything below write-through capability (I-5): an edit becomes a
draft → patch/commit → MR, applied destructively only on explicit intent and only where the
profile + policy both permit. This is what lets a read-only or rate-limited or lossy backend
still be *edited* safely.

### 8.3 Federation is plural & composable (L3) — the model taxonomy

Federation is not one mechanism. shard-wiki selects a **federation model per space and
composes per shard** (mechanism over policy, I-7):

| Model | Anchor | Coordination shape |
|-------|--------|--------------------|
| **Fork + journal** (default home case) | Federated Wiki | copy-with-provenance + per-page action journal (story = replay) |
| **VCS-replication + ping** | ikiwiki | git clone/pull/push + change-ping |
| **Query-time graph-join** | Wikibase SPARQL `SERVICE` | join remote graphs at query time, no copy |
| **Feed aggregation** | RSS/Atom | inbound feed → pages |
| **Activity streams** | ActivityPub | Create/Update events, notify or content-bearing |
| **Engine-mirror** | Wiki.js DB↔Git | engine syncs its own store to a git mirror |
|
||||
|
||||
### 8.4 Union & projection (L4) — the derived cache

This whole layer is **derived-disposable**: recomputable from canonical state — sharded
content + the **coordination-canonical** inputs in the journal (I-2). Crucially, the *automatic*
equivalence results are derived, but the **human/curatorial inputs they consume — alias tables
and curator equivalence bindings — are coordination-canonical (they live in the journal), not
derived**; recompute reads them, never regenerates them. It comprises:

- **Identity resolution & equivalence** — detect "same topic / derived content"
  path-independently from *derived* signals (content fingerprint, span-set overlap) **plus** the
  *coordination-canonical* inputs (alias table, curator binding); present as
  **chorus-of-voices** or designated-canonical (a *policy* preset). (Scaling: §8.7.)
- **Union graph** — the navigable join of pages, links, and dimensions (namespace, genealogy,
  version, shard, equivalence). A *derived lens over canonical files+journal, never a new
  store* (the ZigZag boundary).
- **Transclusion** — one **reference-not-copy** primitive unifying Xanadu transclusion, ZigZag
  clone, Roam/Obsidian/Logseq embed, Notion synced block, Trilium note-cloning, and literate
  named-chunk assembly, over the addressable union.
- **Projection — trivial by default, extensible for the tail.** The 95% case (Markdown in a
  shard) must cost nothing conceptually, so:
  - **Default = plain lazy replication-projection** — a freshness-stamped cache of remote
    content (§8.8). This is *the* projection for ordinary pages; it needs no taxonomy, no
    liveness reasoning, no registry. Most shards never touch anything below.
  - **Extension point — derivation-projection** — invoked *only* for content that is a
    *source* needing transform/compile/weave/evaluate (computational/typed content, §8.5). It
    adds the liveness axis (static → captured → live-over-files → view-time → irreducibly-live)
    and facets (materialization timing, multiplicity, continuity); the irreducibly-live far end
    has no faithful static form (source + a marked recording). A binding that never serves such
    content never instantiates any of this.
  - Both kinds stamp freshness + provenance; only derivation carries the liveness machinery.
- **Moldable view registry — also an extension point, not a tax on every page.** Where a content
  type offers multiple co-equal views (typed/computed/dimensional content), they are registered
  as an **open, type-keyed set, none canonical-by-fact** (display-canonical is policy; GT prior
  art, answers the "pluggable content-type registry" question). An ordinary Markdown page has
  exactly one view and never consults the registry — the registry is queried only when a type
  declares >1 view.
- **Derived query index** — delegate to a shard's native query engine where present
  (Roam/Logseq Datalog, Notion DB query, XWiki XWQL, Wikibase SPARQL); else build a derived
  index over the projection (the Logseq DataScript-over-files pattern). The index is
  disposable (I-2).
|
||||
|
||||
### 8.5 Computational / executable content — the scope decision

**In scope as a page-model + projection concern; out of scope as an execution platform.**
shard-wiki *recognises* computational types, attaches the **canonical source**, and presents
derived forms as **provenance- and liveness-marked projections**. Driving a derivation
(tangle/weave, re-execute a notebook, render a sketch, evaluate a pattern) is a **gated
capability, off by default, with a trust/sandbox concern, degrading to a captured snapshot**.
One snapshot-provenance record (run id, source rev, timestamp, environment "unguaranteed")
serves notebooks, renders, and recordings alike. **No INTENT amendment is required** — this
lives inside the existing page model (L2) and projection model (L4).
|
||||
|
||||
### 8.6 Consistency, concurrency & conflict model
|
||||
|
||||
INTENT makes real-time cross-shard consistency a non-goal — but "no strong consistency" is not
|
||||
the same as "no defined consistency." This is the guarantee shard-wiki *does* offer, and the
|
||||
mechanism (not policy) that makes concurrent editing safe (review bug B-2).
|
||||
|
||||
**The consistency guarantee — causal, anchored on the journal:**
|
||||
|
||||
- **Read-your-writes for coordination-canonical state.** Once an overlay/binding/merge is
|
||||
appended to the space's decision log, every reader of that space sees it — because the log is
|
||||
a **single totally-ordered sequence per space** (one append authority, §8.1), and all readers
|
||||
resolve current state from that one order. The guarantee holds across orchestrator instances,
|
||||
not just within one process; it is cheap because ordering is per-space, never global.
|
||||
- **Causal consistency across the derived tier.** The union/index/projections reflect a causal
|
||||
cut of `(sharded inputs seen so far, journal)`. Effects never appear before their causes; a
|
||||
projection that has seen journal commit *C* has seen everything *C* depends on.
|
||||
- **Eventual convergence for sharded-canonical inputs.** Remote shard content is pulled
|
||||
asynchronously (lazily or by notify/poll, §8.7); the union converges to each shard's latest
|
||||
*as observed*, bounded by the shard's operational envelope. Freshness is always *shown*
|
||||
(provenance envelope), never faked — a stale projection is labelled stale, not wrong.
|
||||
|
||||
So: **strong + read-your-writes for what shard-wiki owns (the journal); causal for what it
|
||||
derives; eventual + freshness-labelled for what shards own.** No global clock, no distributed
|
||||
transaction, no two-phase commit across shards — none is needed, because shard-wiki coordinates
|
||||
rather than controls.
|
||||
|
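The read-your-writes guarantee follows from having one ordered log per space and deriving current state from it. A minimal sketch, assuming an in-memory list stands in for the git-backed journal; `Event`, `DecisionLog`, and `fold` are hypothetical names for illustration, not the project's API:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    seq: int    # position in this space's total order
    kind: str   # e.g. "overlay", "binding", "merge"
    key: str    # what the decision is about (assumed: a page identity)
    value: Any


@dataclass
class DecisionLog:
    """Append-only log for ONE space; the single append authority assigns seq."""
    space: str
    events: List[Event] = field(default_factory=list)

    def append(self, kind: str, key: str, value: Any) -> Event:
        event = Event(seq=len(self.events), kind=kind, key=key, value=value)
        self.events.append(event)
        return event

    def fold(self, upto: Optional[int] = None) -> Dict[Tuple[str, str], Any]:
        """Current state is a fold over a log prefix; later events win."""
        prefix = self.events if upto is None else self.events[: upto + 1]
        state: Dict[Tuple[str, str], Any] = {}
        for event in prefix:
            state[(event.kind, event.key)] = event.value
        return state
```

Because every reader folds the same sequence, a reader that has seen `seq = n` has by construction seen everything before it, which is the causal cut the derived tier relies on.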
||||
**Conflict detection & representation is core mechanism; only resolution is policy (I-7).**
|
||||
The split the earlier draft elided:
|
||||
|
||||
- **Detection (core).** Divergence is detected structurally: two identities resolve as
|
||||
equivalent (§8.4) but their content fingerprints differ, or an overlay's base revision no
|
||||
longer matches the shard's current revision. Detection is always on; it is never optional.
|
||||
- **Representation (core).** A detected conflict is **first-class data in the union**, not an
|
||||
error: equivalent-but-divergent pages are presented as a **coexisting set** (the
|
||||
chorus/keep-both representation), each fully attributed, with the divergence recorded in the
|
||||
provenance envelope (union without erasure — a conflict is information, not a failure).
|
||||
- **Resolution (policy).** *Which* version wins, or whether they stay coexisting, is a
|
||||
configurable preset (§10): chorus / designated-canonical / git-merge / vote-to-merge /
|
||||
overlay-only. Core never hard-codes one.
|
||||
|
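The detection/representation split can be illustrated with a small sketch. It assumes a SHA-256 content hash as the fingerprint and a plain dict as the representation; `Placement` and `represent` are hypothetical names, and the real fingerprint algorithm is deferred by the text:

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Placement:
    shard: str
    identity: str
    content: str

    @property
    def fingerprint(self) -> str:
        # Assumption: a whole-content hash stands in for the real fingerprint.
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()


def represent(equivalent: List[Placement]) -> Dict[str, object]:
    """Return an equivalence set as data; divergence is recorded, never raised."""
    fingerprints = {p.fingerprint for p in equivalent}
    return {
        "members": [(p.shard, p.identity) for p in equivalent],
        "divergent": len(fingerprints) > 1,
        "distinct_versions": len(fingerprints),
    }
```

Nothing here picks a winner: the function only reports the coexisting set, leaving resolution to whichever preset is configured.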
||||
**Overlay-apply under source drift (the concurrent-write case).** An overlay carries the
|
||||
**base revision** of the shard content it was authored against. On apply, core compares base to
|
||||
the shard's current revision:
|
||||
|
||||
- *unchanged* → apply (fast-forward), commit to journal, propagate if the profile permits;
|
||||
- *changed, non-overlapping* → three-way merge where the merge capability allows (axis 6),
|
||||
else keep-both;
|
||||
- *changed, overlapping* → **refuse + re-present** as a conflict (above); never silently
  clobber (I-5, no silent remote mutation). The unapplied overlay remains
  coordination-canonical and valid against its base.
|
||||
|
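The three cases reduce to a small decision function. This is a sketch under stated assumptions: overlap is modelled as intersecting sets of region ids (lines or spans), and `Outcome` and `apply_overlay` are hypothetical names:

```python
from enum import Enum
from typing import Set


class Outcome(Enum):
    FAST_FORWARD = "apply"
    MERGE = "three-way merge"
    KEEP_BOTH = "keep both"
    REFUSE = "refuse and re-present as conflict"


def apply_overlay(base_rev: str, current_rev: str,
                  overlay_regions: Set[str], drift_regions: Set[str],
                  can_merge: bool) -> Outcome:
    """Decide how an overlay authored against base_rev meets the shard's current_rev."""
    if base_rev == current_rev:
        return Outcome.FAST_FORWARD      # unchanged source
    if overlay_regions & drift_regions:
        return Outcome.REFUSE            # overlapping change: never clobber
    # Changed but non-overlapping: merge only if the capability profile allows it.
    return Outcome.MERGE if can_merge else Outcome.KEEP_BOTH
```

Note the ordering of the checks: the overlap test comes before the merge-capability test, so a merge-capable shard still refuses an overlapping change.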
||||
**Ordering.** The journal commit is the ordering authority for coordination-canonical effects;
|
||||
a shard-native write is only *acknowledged* in the journal after the adapter confirms it, so a
|
||||
crash between journal-intent and shard-write is recoverable (the intent is replayable, the
|
||||
write is idempotent-keyed on identity+base-rev). Cross-shard operations are ordered by their
|
||||
journal commits, giving the causal cut above.
|
||||
|
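Crash recovery between journal-intent and shard-write can be sketched as an intent/ack replay. The event shapes, `FakeShard`, and `replay` are assumptions made for illustration; only the idea of keying the write on identity plus base revision comes from the text:

```python
import hashlib
from typing import Dict, List


def idempotency_key(identity: str, base_rev: str) -> str:
    """Key a shard write on identity + base revision so a retry cannot apply twice."""
    return hashlib.sha256(f"{identity}\x00{base_rev}".encode("utf-8")).hexdigest()


class FakeShard:
    """Stand-in adapter: applies each keyed write at most once."""

    def __init__(self) -> None:
        self.applied: Dict[str, str] = {}

    def write(self, key: str, content: str) -> None:
        self.applied.setdefault(key, content)


def replay(journal: List[dict], shard: FakeShard) -> None:
    """Re-drive every intent that has no matching ack; safe to run repeatedly."""
    acked = {e["key"] for e in journal if e["type"] == "ack"}
    for event in list(journal):
        if event["type"] == "intent" and event["key"] not in acked:
            shard.write(event["key"], event["content"])
            journal.append({"type": "ack", "key": event["key"]})
            acked.add(event["key"])
```

If the process dies after the shard write but before the ack, the next `replay` repeats the write, the shard ignores it because the key is already applied, and the ack is then recorded.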
||||
**Residual open items** (tracked in *Known scaling risks & open problems*, §12, not pretended
solved): the exact convergence bound for high-write CRDT shards under partition (O-2), and
whether per-equivalence-set divergence needs a vector clock or a simple base-rev comparison
suffices (O-3), are deferred to implementation spikes.
|
||||
|
||||
### 8.7 Scaling the union — incremental-first, rebuild as fallback
|
||||
|
||||
The derived tier is *recomputable* (I-2) but recompute must never be the **operational**
|
||||
mechanism. A from-scratch rebuild reads every page of every shard — including rate-limited,
|
||||
paginated external APIs (Notion) and irreducibly-live sources — which can take hours to days
|
||||
and directly fights the operational-envelope axis (review C-2). So:
|
||||
|
||||
**Incremental, change-driven maintenance is the primary mechanism.** Each shard's `notify`
|
||||
capability (or a poll/ETag fallback where it has none, §8.8) emits **change events**; an event
|
||||
drives a **delta update** to exactly the affected union nodes, equivalence candidates, indexes,
|
||||
and projections. The derived tier is a continuously-maintained materialised view, not a
|
||||
periodically-recomputed one. Steady-state cost is O(changes), not O(corpus).
|
||||
|
||||
**Full rebuild is a rare, bounded fallback** — for cold start, schema/algorithm change, or
|
||||
suspected corruption — and it is **explicitly not required to be cheap**. It respects each
|
||||
shard's envelope (it may be slow, throttled, or resumable for a rate-limited shard) and runs
|
||||
*concurrently with serving the existing derived tier*; it swaps in atomically on completion.
|
||||
I-2 guarantees rebuild is *possible and correct*, not instant.
|
||||
|
||||
**Equivalence detection is indexed, not pairwise (review C-1).** Naive fingerprint/span-set
|
||||
comparison across all pages of all shards is O(N²) and is forbidden. Instead:
|
||||
|
||||
1. **Blocking / candidate generation** — cheap keys bucket pages that *could* be equivalent:
   normalised title, normalised path tail, explicit alias-table entries
   (coordination-canonical), and **MinHash/LSH bands over content shingles** for
   near-duplicate and derived-content detection. Only within-bucket pairs are considered —
   turning O(N²) into ≈O(N) candidates.
2. **Verification** — candidate pairs are confirmed by full fingerprint / span-set overlap and
   any curator binding. Confirmed equivalences become union edges.
3. **Incremental maintenance — the delta is *not* additive (review B-4).** A changed page may
   *leave* buckets as well as *enter* them, and leaving a bucket can **break an existing
   equivalence edge** another page relied on. So a change is processed as: (i) recompute the
   page's bucket membership; (ii) for buckets it **left**, re-verify the pairs that depended on
   the shared bucket and **retract** edges no longer supported; (iii) for buckets it **entered**,
   verify the new candidate pairs and **add** edges; (iv) **propagate** to the equivalence
   neighbours of any retracted/added edge (equivalence is transitive-ish via chorus sets, so a
   retraction can split a set). Maintenance is per-change and bounded by the page's
   neighbourhood, but it covers retraction and propagation — not just additions.
|
||||
|
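Steps (i) to (iv) can be sketched as a small index that both adds and retracts edges. This assumes verification is plain fingerprint equality and that an edge needs a shared bucket to stay supported; `EquivalenceIndex` and its methods are hypothetical. For brevity `chorus_sets` recomputes every set, where a real implementation would touch only the affected neighbourhood:

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = FrozenSet[str]


class EquivalenceIndex:
    def __init__(self) -> None:
        self.buckets: Dict[str, Set[str]] = defaultdict(set)  # blocking key -> pages
        self.keys: Dict[str, Set[str]] = {}                   # page -> blocking keys
        self.fingerprint: Dict[str, str] = {}
        self.edges: Set[Edge] = set()

    def _supported(self, a: str, b: str) -> bool:
        shares_bucket = bool(self.keys[a] & self.keys[b])
        return shares_bucket and self.fingerprint[a] == self.fingerprint[b]

    def update(self, page: str, keys: Set[str], fingerprint: str) -> Tuple[Set[Edge], Set[Edge]]:
        old = self.keys.get(page, set())
        for k in old - keys:                 # (i) buckets the page left
            self.buckets[k].discard(page)
        for k in keys - old:                 # (i) buckets the page entered
            self.buckets[k].add(page)
        self.keys[page], self.fingerprint[page] = set(keys), fingerprint

        # Pages whose relationship to `page` may have changed.
        affected = {q for e in self.edges if page in e for q in e}
        for k in old | keys:
            affected |= self.buckets[k]
        affected.discard(page)

        added: Set[Edge] = set()
        retracted: Set[Edge] = set()
        for other in affected:
            edge = frozenset((page, other))
            if self._supported(page, other):
                if edge not in self.edges:
                    added.add(edge)          # (iii) new verified pair
            elif edge in self.edges:
                retracted.add(edge)          # (ii) edge no longer supported
        self.edges = (self.edges | added) - retracted
        return added, retracted

    def chorus_sets(self) -> List[Set[str]]:
        """(iv) Connected components of the edge graph, via union-find."""
        parent = {p: p for p in self.keys}

        def find(x: str) -> str:
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for a, b in map(tuple, self.edges):
            parent[find(a)] = find(b)
        groups: Dict[str, Set[str]] = defaultdict(set)
        for p in parent:
            groups[find(p)].add(p)
        return list(groups.values())
```

Returning the added and retracted edges separately lets the caller drive delta updates to the union and projections rather than rebuilding them.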
||||
**The index is itself derived** (disposable, recomputable) and per-tenant-partitioned (§9).
|
||||
Its parameters (LSH band/row counts, shingle size, precision/recall) are tunable; the accepted
|
||||
**false-negative rate of blocking** is a known, tracked limitation (§12) — blocking trades a
|
||||
small miss rate for tractability, and curator bindings are the escape hatch for misses.
|
||||
|
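To make the tunable parameters concrete, here is a pure-standard-library sketch of MinHash signatures cut into LSH bands. The hash choice (salted SHA-1), the default shingle size, and the function names are assumptions for illustration, not the project's chosen algorithm:

```python
import hashlib
from typing import List, Set


def shingles(text: str, size: int = 5) -> Set[str]:
    """Character shingles over whitespace- and case-normalised text."""
    text = " ".join(text.lower().split())
    return {text[i:i + size] for i in range(max(1, len(text) - size + 1))}


def minhash(items: Set[str], num_perm: int) -> List[int]:
    """One minimum per salted hash function; similar sets share many minima."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(4, "big")
        signature.append(min(
            int.from_bytes(hashlib.sha1(salt + s.encode("utf-8")).digest()[:8], "big")
            for s in items
        ))
    return signature


def band_keys(signature: List[int], bands: int, rows: int) -> List[str]:
    """Split the signature into `bands` chunks of `rows`; each chunk is a blocking key."""
    if len(signature) != bands * rows:
        raise ValueError("signature length must equal bands * rows")
    keys = []
    for b in range(bands):
        chunk = signature[b * rows:(b + 1) * rows]
        digest = hashlib.sha1(repr(chunk).encode("utf-8")).hexdigest()[:16]
        keys.append(f"lsh:{b}:{digest}")
    return keys
```

Two pages become candidates when they share at least one band key. For pages with Jaccard similarity `s`, standard LSH banding gives a candidate probability of `1 - (1 - s**rows)**bands`, so more rows per band tightens precision and more bands raises recall.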
||||
**Verifying I-2 (`derived = f(canonical)`) — eventually, not on faith (review B-4).**
|
||||
Incremental maintenance can drift from a from-scratch fold over time (a missed retraction, a
|
||||
dropped event, a bug). I-2 is therefore an **eventually-verified** property, not a free one,
|
||||
and the architecture names the mechanism that verifies it:
|
||||
|
||||
- **A digest of the derived tier.** Each partition's derived tier carries a rolling content
|
||||
digest (a Merkle-style hash over union nodes/edges/index entries) maintained alongside the
|
||||
incremental updates.
|
||||
- **A background consistency-checker** periodically recomputes the digest over a *sampled* (or,
|
||||
on a slow cadence, full) fold of canonical state and compares. A mismatch localises the drift
|
||||
to a partition/region and triggers a **scoped recompute** of just that region — cheap relative
|
||||
to a global rebuild, and self-healing.
|
||||
- **So I-2 holds *eventually and verifiably*:** the incremental engine is the fast path, the
|
||||
checker is the guarantee, and divergence is detected and repaired rather than silently
|
||||
accumulating. The exact sampling rate / digest granularity is an implementation spike (§12).
|
||||
|
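The checker's comparison step might look like the following. It is a simplified sketch: a flat two-level hash per region rather than a true Merkle tree or rolling digest, with entries assumed to arrive as `(region, serialised entry)` pairs; `region_digests` and `drifted_regions` are hypothetical names:

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Entry = Tuple[str, str]  # (region, canonical serialisation of a node/edge/index entry)


def region_digests(entries: Iterable[Entry]) -> Dict[str, str]:
    """Order-independent digest per region: hash the sorted per-entry hashes."""
    per_region: Dict[str, List[str]] = {}
    for region, payload in entries:
        entry_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        per_region.setdefault(region, []).append(entry_hash)
    return {
        region: hashlib.sha256("".join(sorted(hashes)).encode("ascii")).hexdigest()
        for region, hashes in per_region.items()
    }


def drifted_regions(maintained: Iterable[Entry], recomputed: Iterable[Entry]) -> List[str]:
    """Regions whose incrementally maintained digest differs from a fresh fold."""
    a, b = region_digests(maintained), region_digests(recomputed)
    return sorted(r for r in a.keys() | b.keys() if a.get(r) != b.get(r))
```

Each region returned is a candidate for a scoped recompute; an empty list means the sampled fold and the incremental state agree.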
||||
### 8.8 Cache freshness & invalidation
|
||||
|
||||
Replication-projection caches remote shard content; cache invalidation is the actual hard part
|
||||
and was missing from the first draft (review C-2). The protocol is **per-binding, driven by the
|
||||
capability profile**, with one rule: **freshness is always represented, never assumed** — every
|
||||
cached page's provenance envelope carries `(observed-at, source-rev-if-known, staleness-state)`,
|
||||
so a consumer can always tell live from stale.
|
||||
|
||||
**Three invalidation modes, chosen by capability, not hard-coded:**
|
||||
|
||||
| Mode | When | Mechanism |
|------|------|-----------|
| **Event-driven (push)** | shard has `notify` | a change event invalidates exactly the affected entries and enqueues a delta refresh (§8.7); the preferred mode |
| **Validator poll** | shard exposes ETag / Last-Modified / rev | conditional fetch (`If-None-Match`); cheap "still fresh?" checks without transferring bodies |
| **TTL** | shard offers neither | time-bounded staleness; the floor mode (Oddmuse-class shards) |
|
||||
|
||||
Most real bindings are **hybrid**: event-driven for invalidation + a long TTL as a safety net
|
||||
for missed events + validator polls on read when an entry is past a soft age.
|
||||
|
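A hybrid binding's read path can be sketched as a classification of the cached entry followed by a choice of action. The state names (`fresh`, `soft-stale`, `expired`), the action strings, and both function names are assumptions; only the envelope triple comes from the text:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Freshness:
    observed_at: float         # seconds since epoch when the entry was fetched
    source_rev: Optional[str]  # ETag / rev, if the shard exposes one
    staleness_state: str       # "fresh" | "soft-stale" | "expired"


def classify(observed_at: float, source_rev: Optional[str],
             now: float, soft_age: float, ttl: float) -> Freshness:
    """Label an entry by age: under soft_age is fresh, under ttl soft-stale, else expired."""
    age = now - observed_at
    state = "fresh" if age < soft_age else "soft-stale" if age < ttl else "expired"
    return Freshness(observed_at, source_rev, state)


def read_action(freshness: Freshness) -> str:
    """Pick what a read does; the envelope is returned to the consumer either way."""
    if freshness.staleness_state == "fresh":
        return "serve"
    if freshness.source_rev is not None:
        return "validate"              # conditional fetch using the known rev
    if freshness.staleness_state == "soft-stale":
        return "serve-labelled-stale"
    return "refetch"
```

A binding to a rate-limited shard would skip the `validate` branch on reads and rely on events plus the long TTL instead.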
||||
**Operational-envelope coupling.** The mode is constrained by axis-10: a **rate-limited** shard
|
||||
(Notion) *must* favour event-driven + long TTL and *must not* poll per-read — the freshness
|
||||
policy is capability-gated like everything else. A local file shard can watch the filesystem
|
||||
(near-instant invalidation, effectively event-driven for free).
|
||||
|
||||
**Thundering-herd / coalescing.** Concurrent reads of the same stale entry trigger a **single
|
||||
in-flight refresh** (single-flight); other readers await it or are served the stale-but-labelled
|
||||
value per policy. Bulk invalidations (a shard-wide event) are **batched and rate-shaped** to the
|
||||
shard's envelope rather than fired as N concurrent fetches.
|
||||
|
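Single-flight coalescing can be sketched with the standard library alone. This version makes followers wait for the leader's result; serving them the stale-but-labelled value instead would be the other policy branch. `SingleFlight` and `do` are hypothetical names:

```python
import threading
from concurrent.futures import Future
from typing import Callable, Dict


class SingleFlight:
    """Coalesce concurrent refreshes of the same key into one in-flight call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, Future] = {}

    def do(self, key: str, refresh: Callable[[], object]) -> object:
        with self._lock:
            future = self._inflight.get(key)
            leader = future is None
            if leader:
                future = self._inflight[key] = Future()
        if not leader:
            return future.result()        # follower: wait for the leader's refresh
        try:
            future.set_result(refresh())  # leader: perform the one real fetch
        except BaseException as exc:
            future.set_exception(exc)     # followers see the same failure
        finally:
            with self._lock:
                self._inflight.pop(key, None)
        return future.result()
```

The entry is removed from the in-flight table once the refresh settles, so a later read of a newly stale entry starts a fresh fetch rather than reusing an old result.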
||||
**Staleness is a policy knob, not a correctness bug.** Whether a reader gets *stale-but-fast* or
|
||||
*blocks-for-fresh* is a §10 preset (per space or per request); either way the envelope tells the
|
||||
truth about what was served. This is union-without-erasure applied to time.
|
||||
|
||||
---
|
||||
|
||||
## 9. Cross-cut — Authorization (L5)
|
||||
|
||||
Fully specified in **`ArchitectureBlueprint.md`** (the access & history sub-blueprint);
|
||||
summarised here for completeness:
|
||||
|
||||
- **One core, a ladder of modes:** L0 (open/c2, zero deps) → L1 (attributed) → L2
(authenticated) → L3 (role/group) → L4 (multi-tenant enterprise). Climbing is configuration,
not re-architecture.
|
||||
- **PEP** wraps every adapter op; **PDP** decides `(principal, action, target)` over actions
|
||||
`read/write/patch/merge/administer`, layered on the adapter's capability profile (a shard
|
||||
that can't write can't be written regardless of policy — L5 consults the L1 rail).
|
||||
- **Authentication delegated** to a pluggable IdentityProvider (null provider = L0 default);
|
||||
real identity from `user-engine` over `net-kingdom` IAM.
|
||||
- **Fail open only at L0, fail closed at L2+.** Authorization is pure/offline once a Principal
|
||||
is resolved. Provenance carries authz context so the union never leaks unreadable content
|
||||
(the L5↔provenance-rail interaction).
|
||||
|
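The layering of policy on top of the capability profile can be sketched as a pure decision function. Assumptions: the capability profile is reduced to a set of supported actions, the mode ladder to an integer, and L1 is treated as fail-closed (the text only fixes L0 and L2+). `Decision` and `decide` are hypothetical names:

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

ACTIONS = frozenset({"read", "write", "patch", "merge", "administer"})


@dataclass(frozen=True)
class Decision:
    allow: bool
    reason: str


def decide(level: int, principal: Optional[str], action: str, target: str,
           capabilities: FrozenSet[str],
           policy: Callable[[str, str, str], bool]) -> Decision:
    """PDP sketch: capability profile first, then mode, then policy."""
    if action not in ACTIONS:
        return Decision(False, "unknown action")
    if action not in capabilities:
        # A shard that cannot write cannot be written, regardless of policy.
        return Decision(False, "capability profile does not support action")
    if level == 0:
        return Decision(True, "L0 open mode")
    if principal is None:
        return Decision(False, "no principal resolved")
    try:
        allowed = policy(principal, action, target)
    except Exception:
        return Decision(False, "policy error, failing closed")
    return Decision(allowed, "policy")
```

A PEP would call `decide` around every adapter operation and refuse the operation whenever `allow` is false.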
||||
### 9.1 Tenant isolation of the derived tier (review B-3)
|
||||
|
||||
Read-time authz filtering is necessary but **not sufficient** when the derived tier is
|
||||
*persisted*: a single cross-tenant union/index cache guarded only by a filter on read is a
|
||||
standing leak surface (one filtering bug exposes another tenant's content). So isolation is
|
||||
**structural, not just procedural**:
|
||||
|
||||
- **The derived tier is partitioned per tenant / root entity.** A tenant maps to a root entity
|
||||
(§4); its union graph, equivalence index, projections, and caches live in a **separate
|
||||
partition** keyed by that tenant. There is no shared cross-tenant derived store to leak from.
|
||||
- **No cross-tenant equivalence by default.** Blocking/LSH (§8.7) operates *within* a partition;
|
||||
cross-tenant equivalence is an explicit, authorised, opt-in federation between roots, never an
|
||||
accident of a shared index.
|
||||
- **Read-time filtering remains, as defence-in-depth** — the provenance envelope's authz context
|
||||
is still checked, so even within a partition a principal sees only what it may; partitioning
|
||||
removes the *blast radius*, filtering removes the *fine-grained* leak.
|
||||
- **This reconciles I-2 with L5:** recomputability (a persisted-but-disposable derived tier) is
|
||||
preserved *per partition* — each tenant's derived tier is independently rebuildable from that
|
||||
tenant's canonical state — so isolation costs nothing in the rebuild model. At L0/L1 (single
|
||||
tenant) there is one partition and the machinery is invisible.
|
||||
|
||||
**Isolation invariant (add to §2 as I-13):** *derived state is partitioned by tenant; no
|
||||
derived artifact spans tenants except through an explicit, authorised cross-root federation.*
|
||||
|
||||
---
|
||||
|
||||
## 10. The policy surface (mechanism over policy, made concrete)
|
||||
|
||||
I-7 only means something if the policy knobs are enumerated and kept *out* of core algorithms.
|
||||
The configurable presets are:
|
||||
|
||||
- **Canonical-source policy** — chorus / designated-canonical / git-merge / overlay-only /
|
||||
vote-to-merge (per space or per equivalence set).
|
||||
- **Federation model** — the §8.3 taxonomy, per space, composable per shard.
|
||||
- **Shard mode** — read-only / write-through / mirrored / projected / cached / canonical
|
||||
(constrained by the capability profile).
|
||||
- **Reconciliation cadence & conflict exposure** — push/poll/manual; show-conflicts vs
|
||||
auto-merge-when-supported.
|
||||
- **Conflict-resolution preset** — chorus / designated-canonical / git-merge / vote-to-merge /
|
||||
overlay-only (the *resolution* policy over §8.6's core detection; per space or equivalence set).
|
||||
- **Freshness / invalidation mode** — event-driven / validator-poll / TTL / hybrid, and
|
||||
stale-but-fast vs block-for-fresh on read (§8.8; constrained by the operational envelope).
|
||||
- **History compaction** — squash policy for low-value churn, gc/repack cadence, per-shard
|
||||
offload (§8.1), always preserving recoverable endpoints.
|
||||
- **Tenant partition mapping** — tenant ↔ root-entity, and any explicit cross-root federation
|
||||
(§9.1, I-13).
|
||||
- **Execution policy** — derive/execute off (default) / sandboxed / per-shard-allowed.
|
||||
- **Authorization mode** — the L0–L4 ladder.
|
||||
- **Projection materialization** — lazy/eager; snapshot vs view-time; recording retention.
|
||||
|
||||
Core ships sane defaults (L0 open; fork+journal; lazy replication-projection; event-driven+TTL
|
||||
freshness; overlay-before-mutation; execution off; one tenant = one root) and never hard-codes
|
||||
any of the above. (**Preset bundles** that package coherent knob-sets per persona are tracked
|
||||
as O-8, §12 — flexibility without bundles is operator burden.)
|
||||
|
||||
---
|
||||
|
||||
## 11. Concrete module structure (bridge to implementation)
|
||||
|
||||
A proposed package layout for `src/shard_wiki/`, mapping 1:1 to the layers so the dependency
|
||||
rule (downward only; the derived tier is incrementally maintained, rebuild = fallback) is
|
||||
enforceable by import lint:
|
||||
|
||||
```
src/shard_wiki/
  model/          # L2 top waist: Page, Identity, Placement, ProvenanceEnvelope,
                  #   Span, the page-shape types; capability-spectrum value types
  adapters/       # L1 bottom waist: AdapterContract (versioned iface), CapabilityProfile,
                  #   attachment-mode binding; concrete adapters:
    git/ folder/ gitea/ obsidian/ webdav/ notion/ …   # each: profile + verbs
  coordination/   # L3: DecisionLog (append-only, git-backed, per-space append authority/
                  #   lease), OverlayEngine (draft→patch→MR), reconcile
                  #   (current coordination state = a derived fold → lives in union/)
  federation/     # L3: FederationModel strategies (fork_journal, vcs_ping,
                  #   graph_join, feed, activitypub, engine_mirror)
  union/          # L4 (derived): IdentityResolver, EquivalenceGraph, UnionGraph,
                  #   Transclusion (reference-not-copy)
  projection/     # L4 (derived): ReplicationProjection, DerivationProjection,
                  #   ViewRegistry (moldable), QueryIndex (delegate|derive)
  authz/          # L5 cross-cut: PDP, PEP, IdentityProvider iface, NullProvider
  provenance/     # cross-cut LEAF: ProvenanceEnvelope type + ⊕ (effective) only — pure data
  policy/         # cross-cut LEAF: the §10 policy surface (presets + a resolve() read by
                  #   coordination/federation/projection/authz); owns NO mechanism
  api/            # L6: orchestrator API (server-side union for agents/CLI)
```
|
||||
|
||||
**The cross-cutting rails are leaves, not god-modules (review D-4).** `provenance/` and
|
||||
`policy/` are imported widely, so they are the highest coupling risk; the discipline that caps
|
||||
it is: **they may import *nothing* in the tree and contain *only* stable data types + pure
|
||||
functions** (the envelope and its `⊕`; the policy presets and a `resolve(question) → choice`).
|
||||
Mechanism never lives in a rail — `policy/` says *what* the preset is, `coordination/`/
|
||||
`projection/` decide *how* to honour it. A change to a rail is then a change to a small, stable,
|
||||
dependency-free leaf, not a ripple through every layer. Capability-spectrum value types live in
|
||||
`model/` (also leaf-like) for the same reason.
|
||||
|
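What "stable data types + pure functions" could mean for `policy/` is sketched below. The knob names are hypothetical; the default values are the ones §10 lists. The layering (request over space over defaults) is an assumption drawn from the "per space or per request" wording in §8.8:

```python
from types import MappingProxyType
from typing import Mapping, Optional

# Read-only defaults, mirroring the defaults listed in §10 (knob names are illustrative).
DEFAULTS: Mapping[str, str] = MappingProxyType({
    "authorization_mode": "L0",
    "federation_model": "fork_journal",
    "projection_materialization": "lazy",
    "freshness_mode": "event-driven+ttl",
    "execution": "off",
})


def resolve(question: str,
            space: Optional[Mapping[str, str]] = None,
            request: Optional[Mapping[str, str]] = None) -> str:
    """Pure lookup: the most specific layer that answers the question wins."""
    for layer in (request, space, DEFAULTS):
        if layer is not None and question in layer:
            return layer[question]
    raise KeyError(f"unknown policy question: {question}")
```

The function reports what the preset is and nothing more; acting on the answer stays in `coordination/` or `projection/`, which keeps the rail free of mechanism.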
||||
Hard import rules (enforced by import lint):
|
||||
- `union/` and `projection/` may import `model/`, `adapters/`, `coordination/`, `policy/`,
|
||||
`provenance/` — but **nothing may import them** (they are the disposable derived tier).
|
||||
- `model/`, `adapters/`, `provenance/`, `policy/` import nothing else in the tree (the waists
|
||||
and rails stay thin); `provenance/` and `policy/` import nothing at all.
|
||||
- `coordination/` and `federation/` may import the waists + rails, never the derived tier.
|
||||
|
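One way such an import lint could work is a table of allowed targets checked against parsed imports. This is a sketch, not the project's lint: the table encodes the three rules above as written, `api/` and `authz/` are left unchecked because the rules do not cover them, relative imports are ignored, and the stricter "import nothing at all" clause for the rails is not enforced:

```python
import ast
from typing import Dict, List, Set

ROOT = "shard_wiki"
LEAVES = {"model", "adapters", "provenance", "policy"}   # waists and rails

# package -> in-tree packages it may import (derived from the rules above)
ALLOWED: Dict[str, Set[str]] = {
    **{pkg: set() for pkg in LEAVES},
    "coordination": set(LEAVES),
    "federation": set(LEAVES),
    "union": LEAVES | {"coordination"},
    "projection": LEAVES | {"coordination"},
}


def violations(package: str, source: str) -> List[str]:
    """Return one message per forbidden in-tree import found in `source`."""
    allowed = ALLOWED.get(package)
    if allowed is None:
        return []                      # package not covered by the stated rules
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if parts[0] != ROOT or len(parts) < 2:
                continue               # stdlib or third-party import
            target = parts[1]
            if target != package and target not in allowed:
                found.append(f"{package}/ may not import {target}/")
    return found
```

Because `union` and `projection` appear in no allowed set, any covered package importing the derived tier is reported, which is the "nothing may import them" rule.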
||||
---
|
||||
|
||||
## 12. Known scaling risks & open problems
|
||||
|
||||
Tracked honestly rather than pretend-solved (review disposition F). Each has a **chosen
|
||||
direction** and a **revisit trigger** — the thing that, if observed, forces a redesign.
|
||||
|
||||
| # | Risk / open problem | Chosen direction | Revisit trigger |
|---|---------------------|------------------|-----------------|
| O-1 | **Equivalence blocking misses true matches** (LSH false negatives, §8.7) | accept a small miss rate; curator bindings are the escape hatch | measured recall below an agreed threshold on real corpora |
| O-2 | **Convergence bound for high-write CRDT shards under partition** (§8.6) | causal via journal + CRDT-native merge at the shard; no global bound promised | user-visible divergence that outlives a partition |
| O-3 | **Per-equivalence-set divergence tracking** (§8.6) | start with base-rev comparison; add vector clocks only if needed | 3-way concurrent divergence that base-rev mis-orders |
| O-4 | **Persisted derived-tier cost ceiling** (§8.7/§9.1) | per-tenant partition, incremental-maintained, rebuild is fallback | a tenant whose incremental cost still exceeds budget |
| O-5 | **Axis-interaction completeness** (§6.5) | the named interaction table is the contract; extend deliberately | a real adapter needing an interaction not in the table |
| O-6 | **Span-address portability across projection** (§7.2) | shard-scoped native-id wrapping now; tumbler later | cross-shard transclusion that native ids can't satisfy |
| O-7 | **Squash-compaction vs. perfect auditability** (§8.1) | compact the *path*, preserve reachable states; configurable | a compliance need for every intermediate keystroke |
| O-8 | **Policy-knob proliferation → operator burden** (§10) | ship named **preset bundles** ("personal vault" / "team wiki" / "enterprise federation") over the policy surface | operators mis-configuring interacting knobs |
| O-9 | **Shard sharing across roots vs tenant partition** (§9.1, I-13) | shard exclusive to one root by default; explicit shared-read binding otherwise (avoids double-caching a rate-limited shard) | a shard legitimately needed live in two tenants |
| O-10 | **Span-level authz under transclusion** (aggregation/inference leak; ⊕ across boundaries, §7.3/§9) | a transcluded span inherits the **stricter** of source & host authz; provenance ⊕ composes the source-page envelope under the host | a real cross-authz transclusion |
| O-11 | **Union under shard unavailability** (§8.8 covers stale, not down) | **partial union** + per-shard "unavailable" provenance + last-known-projection where policy allows | an SLA need on partial reads |
| O-12 | **Per-space append-log throughput ceiling** (§8.1 append authority) | single writer per space scales across spaces; per-space log sharding if needed | a single space exceeding one writer's append rate |
|
||||
|
||||
These are the spec-writing inputs for `SHARD-WP-0002`; none blocks the architecture, each
|
||||
scopes an implementation spike.
|
||||
|
||||
---
|
||||
|
||||
## 13. Canonical data flows (the architecture exercised)
|
||||
|
||||
**A. Attach a shard.** Adapter binds (chosen attachment mode) → probes/declares a capability
|
||||
profile → core registers the shard under a root entity → if not git-native, the coordination
|
||||
journal is seeded (begin-now/mirror/import per axis 5). No union rebuild yet (lazy).
|
||||
|
||||
**B. Read a page through the union.** Consumer asks the union for an identity → Identity
|
||||
resolver maps it to placements across shards → equivalence yields chorus or canonical →
|
||||
replication-projection lazily fetches from each shard (cache + freshness) → page returned
|
||||
wrapped in its provenance envelope → L5 filters anything the principal can't see at source.
|
||||
|
||||
**C. Edit a read-only / limited shard.** Write request → L5 PDP allows → capability profile
|
||||
says < write-through → OverlayEngine records a draft → renders a patch/MR in the shard's native
|
||||
syntax (lossless) or Markdown (lossy-with-report) → on explicit apply, commit to the journal
|
||||
and (if the profile permits) propagate; otherwise the overlay stands as the local truth, fully
|
||||
attributed.
|
||||
|
||||
**D. Attach a computational notebook.** Adapter declares profile (attachment=file-store,
|
||||
opacity=mixed, computational=captured-output). Core attaches the `.ipynb` **source** as
|
||||
canonical; presents cells + embedded outputs as **derivation-projection snapshots** marked
|
||||
"run N, env unguaranteed"; offers a static render via the view registry; re-execution stays
|
||||
gated off. History uses paired-text/nbdime per axis 5.
|
||||
|
||||
---
|
||||
|
||||
## 14. Key tradeoffs & decisions
|
||||
|
||||
Decided:
|
||||
|
||||
- **Capability spectra over a verb checklist** — richer contract for precise, uniform
|
||||
degradation; tamed by an orthogonal core + implied positions + a named interaction table
|
||||
(§6.5). (Decided.)
|
||||
- **Three states; derived = f(canonical)** — sharded + coordination canonical, derived
|
||||
disposable (§1). (Decided; supersedes the earlier "edges vs middle" framing.)
|
||||
- **Event-sourced coordination, one append authority per space** — coordination-canonical state
|
||||
is an append-only **decision log** in the git journal; current state is a derived fold; a
|
||||
per-space append lease gives a totally-ordered log and read-your-writes across orchestrator
|
||||
instances (§8.1). (Decided — resolves the single-vs-multi-writer keystone.)
|
||||
- **Profiles are verified, not asserted** — a versioned **conformance suite** gates adapter
|
||||
admission; capability-as-data acts on verified data (§6.6). (Decided.)
|
||||
- **I-2 is eventually-verified** — incremental maintenance is the fast path; a digest +
|
||||
background consistency-checker detects and self-heals drift (§8.7). (Decided.)
|
||||
- **Incremental-first, rebuild-as-fallback** — the derived tier is continuously maintained from
|
||||
change events; full rebuild is rare and need not be cheap (§8.7). (Decided — resolves the
|
||||
earlier "union graph persistence" open item: **persisted, per-tenant, incrementally
|
||||
maintained, rebuildable**, §9.1.)
|
||||
- **Identity ≠ fingerprint** — page identity is a stable handle; fingerprints are for
|
||||
equivalence (§7.2). (Decided.)
|
||||
- **Consistency = read-your-writes (journal) + causal (derived) + eventual/freshness-labelled
|
||||
(shards)**; conflict detection/representation is core, resolution is policy (§8.6). (Decided.)
|
||||
- **Address scheme** — shard-scoped native-id wrapping now; portable tumbler later (§7.2, O-6).
|
||||
(Decided.)
|
||||
- **Default federation = fork+journal over Git**; other models opt-in (§8.3). (Decided.)
|
||||
- **Execution off by default** — recognise+project always; execute only when gated (§8.5). (Decided.)
|
||||
- **Derived tier is tenant-partitioned** (I-13, §9.1). (Decided.)
|
||||
|
||||
Still open (carried to §12 / policy):
|
||||
|
||||
1. **L1 "attributed-but-open" mode** — ship it or jump L0→L2? (Carried from ArchitectureBlueprint.)
|
||||
2. **Per-page ACL default** — off (per-shard/namespace) confirmed; revisit only if demand appears.
|
||||
3. The implementation spikes in **§12** (O-1…O-12).
|
||||
|
||||
---
|
||||
|
||||
## 15. What this architecture is *not*
|
||||
|
||||
- Not a wiki engine, UI, or rendering pipeline (those are consumers at L6).
|
||||
- Not a canonical-source-of-truth — shards keep sovereignty; the middle is derived.
|
||||
- Not a generic file-sync daemon — synchronisation is wiki-page-semantic.
|
||||
- Not an execution platform — computation is recognised and projected, not hosted.
|
||||
- Not a universal ontology — no single schema is imposed on all shards.
|
||||
- Not an authentication/identity store — that is delegated (authorization is owned).
|
||||
|
||||
---
|
||||
|
||||
## 16. Traceability
|
||||
|
||||
- **INTENT** — every invariant in §2 (I-1…I-13) cites an INTENT principle or boundary; no
|
||||
invariant contradicts the Stability Note.
|
||||
- **Review & hardening** — this revision folds in
|
||||
`history/260615-core-architecture-blueprint-review.md` via **`SHARD-WP-0005`**: A-1→§1/§3/§4
|
||||
(three-state re-frame), B-1→§7.2 (identity vs equivalence), B-2→§8.6 (consistency/conflict),
|
||||
B-3→§9.1+I-13 (tenant isolation), C-1/C-2→§8.7/§8.8 (incremental + indexed + invalidation),
|
||||
C-3→§8.1 (history scaling), D-1→§6.5 (orthogonal core), D-2→§7.3 (layered provenance),
|
||||
D-3→§8.4 (common-case projection), D-4→§11 (policy module + rail discipline); open items→§12.
|
||||
- **Round-2 review & hardening II** — folds in
|
||||
`history/260615-core-architecture-blueprint-review-2.md` via **`SHARD-WP-0006`**:
|
||||
A-1…A-4→§3/§4/§10/§11 (overview reconciled to the body), B-1+B-3→§8.1 (event-sourced
|
||||
coordination + per-space append authority), B-2→§6.6 (adapter conformance suite),
|
||||
B-4→§8.7 (incremental retraction/propagation + I-2 digest/checker); C-1…C-4→§12 (O-8…O-11).
|
||||
- **Research** — §6 (spectra) ← `260614-shard-spectrum-synthesis` v3; §8.3 (federation
|
||||
taxonomy) ← v3 §2.5; §8.4–§8.5 (two-axis projection, view registry, computational scope) ←
|
||||
`260614-computational-page-model-synthesis`; §7 page shapes ← the engine + modern-tool +
|
||||
computational dives; §1 thesis ← the files-canonical/index-derived through-line across
|
||||
Logseq/ikiwiki/GT/Pharo/Jupyter.
|
||||
- **Use cases** — the architecture is sized to UC-01–UC-84: federation/coordination (UC-01–07,
|
||||
26–33, 56, 71–72, 79) → §8; attachment/adapter (UC-34–43, 50, 53, 57, 60–62, 64–66, 68–70,
|
||||
76–82) → §6; page model & fidelity (UC-34, 39, 42, 55, 58–59, 67, 73, 80, 83–84) → §7/§8.5;
|
||||
addressing/identity/query (UC-32, 44–49, 51–52, 54, 63, 74) → §7.2/§8.4; provenance &
|
||||
metadata (UC-24–25, 75) → the provenance rail; collaboration & discovery (UC-08–23) → L6
|
||||
consumers over the union.
|
||||
- **Workplans** — §6–§8 are the design target of `SHARD-WP-0002` (T11–T18); §9 is owned by
|
||||
`ArchitectureBlueprint.md`; §1 (yawex-derived resolution/overlay) aligns with
|
||||
`SHARD-WP-0001`.
|
||||
|
||||
---
|
||||
|
||||
## 17. Stability note
|
||||
|
||||
This document defines shard-wiki's **internal** architecture; it may evolve as the spec
|
||||
workplans land. But the **thesis (§1)**, the **invariants (§2)**, and the **dual narrow waist
|
||||
(§1, §6, §7)** are load-bearing — changing any of them is an architectural change in the sense
|
||||
of INTENT's Stability Note and should be rare and deliberate.
|
||||
220
spec/FederationArchitecture.md
Normal file
@@ -0,0 +1,220 @@
|
||||
# FederationArchitecture
|
||||
|
||||
Status: **draft for review** · Date: 2026-06-15 · Deliverable of **SHARD-WP-0002** (T1–T10, T17)
|
||||
|
||||
The federation **design decisions** for shard-wiki: *what the union does*. It records, per
|
||||
topic, a decision with rationale, tradeoffs, and a **Decided / Deferred / Open** footer. It
|
||||
**references** `spec/CoreArchitectureBlueprint.md` (the whole-system architecture) and
|
||||
`spec/FederationRequirements.md` (yawex-derived ADRs) rather than restating them; the adapter
|
||||
contract (*what a backend must expose*) is the companion deliverable in
|
||||
`spec/TechnicalSpecificationDocument.md` §A (T11–T16, T18). UC references → `UseCaseCatalog.md`.
|
||||
|
||||
Cross-cutting invariants assumed throughout (blueprint §2): orchestrator-not-engine (I-1),
|
||||
three-state canonical/derived (I-2), capability-as-data (I-3), union-without-erasure (I-4),
|
||||
overlay-before-mutation (I-5), git-addressable coordination (I-6), mechanism-over-policy (I-7),
|
||||
graceful degradation (I-8), identity≠placement (I-9), history-as-floor (I-10), authz-in-core
|
||||
(I-11), tenant-partitioned derived state (I-13).
|
||||
|
||||
---
|
||||
|
||||
## T1 — Orchestrator positioning & boundaries
|
||||
|
||||
**Decision.** shard-wiki is an **adapter-layer orchestrator** in a **star** shape (many
|
||||
sovereign shards ↔ one information space), **not** a homogeneous federation network. It
|
||||
**compares, does not equate** (Federated Wiki homogeneous JSON sites, ikiwiki homogeneous git
|
||||
wikis, ActivityPub activity streams are *models it can speak*, §T17 — not shapes it imposes).
|
||||
Core owns: the union, the coordination journal, projection, overlay, resolution, authorization.
|
||||
Adapters own backend specifics; UI and editorial policy live outside core. (Blueprint §1, §5;
|
||||
UC-02, UC-26.)
|
||||
|
||||
*Decided:* star orchestrator; compare-not-equate; core/adapter/UI/policy boundaries.
|
||||
*Deferred:* the reference UI (L6) is out of scope here. *Open:* none.
|
||||
|
||||
## T2 — Remix primitives: reference / projection / overlay / import-fork
|
||||
|
||||
**Decision.** Four primitives, escalating in commitment; **overlay-before-mutation is the
|
||||
default write path** (I-5), fork is *one federation model* (§T17) not the default for editing:
|
||||
|
||||
| Primitive | Trigger | Writes remote? | Coordination-canonical? |
|-----------|---------|----------------|--------------------------|
| **Reference** | link only | no | no (a link in content) |
| **Projection** | read remote page | no (cache) | no (derived) |
| **Overlay** | edit a sub-write-through shard | not until explicit apply | **yes** (decision log) |
| **Import / fork** | copy into a writable shard / fork-with-provenance | source unchanged | **yes** (fork event) |
|
||||
|
||||
(Blueprint §8.2; FederationRequirements ADR-05; UC-04, UC-26, UC-29.)
|
||||
|
||||
*Decided:* the four primitives + overlay-default. *Deferred:* fork-attribution portability
|
||||
format. *Open:* whether "import" and "fork" are one primitive with a flag or two (impl spike).
|
||||
|
||||
## T3 — Equivalent-page identity & multi-version presentation
|
||||
|
||||
**Decision.** Separate **identity** (stable handle), **placement** (N paths/shards), and
|
||||
**equivalence** (content-fingerprint / span-set overlap *across* identities), per
|
||||
FederationRequirements ADR-01/ADR-02 (I-9). Equivalence signals: normalized title/path, alias
|
||||
table (coordination-canonical), link-graph overlap, fingerprint, **curator binding**. Default
|
||||
presentation = **chorus-of-voices** (all equivalent versions, attributed);
**designated-canonical** is a policy preset (§T9). Divergence is data in the provenance envelope (I-4),
|
||||
detected as core mechanism (blueprint §8.6). (UC-07, UC-27; UC-46.)
|
||||
|
||||
*Decided:* identity/placement/equivalence split; chorus default. *Deferred:* fingerprint
|
||||
algorithm + blocking params (blueprint O-1). *Open:* curator-binding UX (L6).
|
||||
|
||||
## T4 — History, attribution & the coordination journal
|
||||
|
||||
**Decision.** Each information space has a **git-addressable, event-sourced coordination
|
||||
journal** (blueprint §8.1, I-6/I-10): coordination-canonical decisions (overlay/binding/alias/
|
||||
merge/fork) are append-only events; current state is a derived fold. **One append authority per
|
||||
space** gives a total order (read-your-writes across instances). Per-shard native history is
|
||||
**adopted** (git-native), **supplemented** (DB/CRDT internal → journal begins-now/mirrors), or
|
||||
**imported** (open file history backfilled) per the history-portability axis (TSD §A T13).
|
||||
Attribution is portable on the event (UC-29). (UC-29, UC-33.)
|
||||
|
||||
*Decided:* event-sourced journal + per-space append authority + adopt/supplement/import.
|
||||
*Deferred:* per-space log sharding for extreme write rates (blueprint O-12). *Open:* none.
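The journal-plus-fold idea in this decision can be sketched as follows. This is a minimal illustration under stated assumptions, not shard-wiki's implementation: `Event`, `Journal`, `fold_aliases`, and the payload keys `alias` / `identity` are hypothetical names, and a single in-process object stands in for the per-space append authority.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    seq: int        # position in the total order, assigned by the append authority
    kind: str       # e.g. "overlay", "binding", "alias", "merge", "fork"
    author: str     # attribution travels on the event itself
    payload: dict


@dataclass
class Journal:
    """Hypothetical single append authority for one information space."""
    _events: list = field(default_factory=list)

    def append(self, kind: str, author: str, payload: dict) -> Event:
        event = Event(seq=len(self._events), kind=kind, author=author, payload=payload)
        self._events.append(event)  # append-only: events are never edited or removed
        return event

    def events(self) -> tuple:
        return tuple(self._events)


def fold_aliases(events) -> dict:
    """Derive the current alias table by replaying events in sequence order."""
    table = {}
    for event in sorted(events, key=lambda e: e.seq):
        if event.kind == "alias":
            table[event.payload["alias"]] = event.payload["identity"]
    return table
```

Because the alias table is only ever produced by the fold, it can be discarded and rebuilt from the journal at any time; a later `alias` event for the same name simply wins on replay.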
|
||||
|
||||
## T5 — Union composition layer
|
||||
|
||||
**Decision.** The **server-side orchestrator union is primary** — the derived tier (union
|
||||
graph, indexes, projections) is composed in core for agents/CLI/non-browser consumers
|
||||
(blueprint §8.4), **incrementally maintained** (not recomputed, §8.7), **tenant-partitioned**
|
||||
(I-13). **Client-side composition** (Federated-Wiki-style browser pull) is a supported
|
||||
*consumer pattern over the same union*, not the only path. Freshness/caching per §8.8. (UC-05,
|
||||
UC-27, UC-03, UC-31.)
|
||||
|
||||
*Decided:* server-primary, client-optional; incremental, partitioned. *Deferred:* client
|
||||
SDK shape. *Open:* persisted-union digest granularity (blueprint O-4, B-4 checker).
|
||||
|
||||
## T6 — Change notification & subscription transports
|
||||
|
||||
**Decision.** Change-notification is an **optional adapter capability** (`notify`), and it is
|
||||
the **primary driver of incremental maintenance** (blueprint §8.7/§8.8). Transports, per
|
||||
profile: git hook / ikiwiki-style ping, ActivityPub Create/Update, webhook, WebDAV/HTTP ETag
|
||||
or `Last-Modified` poll, plain polling fallback. Each notification invalidates the affected entries,
enqueues a delta refresh, and adds a RecentChanges union entry (FederationRequirements ADR-03; UC-31,
|
||||
UC-17). Rate-limited shards favour event-driven + long TTL (operational-envelope axis).
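The three effects of a notification described above can be sketched in a few lines. This is an illustrative sketch only: `UnionCache`, `on_notify`, and the `(shard, page_id)` cache key are assumptions made for the example, and transport handling (git hook, webhook, ETag poll) is left out entirely.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class UnionCache:
    entries: dict = field(default_factory=dict)          # (shard, page id) -> cached projection
    refresh_queue: deque = field(default_factory=deque)  # pending delta refreshes
    recent_changes: list = field(default_factory=list)   # union-level RecentChanges feed

    def on_notify(self, shard: str, page_ids: list, observed_at: str) -> None:
        for page_id in page_ids:
            key = (shard, page_id)
            self.entries.pop(key, None)        # 1. invalidate the affected entry
            self.refresh_queue.append(key)     # 2. enqueue a delta refresh
            self.recent_changes.append(        # 3. record a RecentChanges entry
                {"shard": shard, "page": page_id, "observed_at": observed_at}
            )
```

Only the notified pages are touched; everything else in the derived tier stays as it was, which is what makes the maintenance incremental rather than a recompute.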
|
||||
|
||||
*Decided:* notify = optional capability, drives incremental + RecentChanges; transport per
|
||||
profile. *Deferred:* ActivityPub as content-bearing (vs notify-only) — start notify-only.
|
||||
*Open:* fediverse-dependency posture for v1.
|
||||
|
||||
## T7 — Information-space lifecycle
|
||||
|
||||
**Decision.** Root entities have lifecycle states: **active → read-only-archived → retired
|
||||
(detached)**, and **merged-into-successor**. **Carry-forward** is selective import from an
|
||||
archived shard (UC-28, UC-30). **Space-fork / branch** (UC-33) = branch the coordination
|
||||
journal + shard-config (a fork event, §T2/§T17 fork+journal model). Retirement **preserves
|
||||
read-only union entries** by default (history-as-floor, I-10), never hard-deletes projections.
|
||||
(UC-28, UC-30, UC-33.)
|
||||
|
||||
*Decided:* the lifecycle states + carry-forward + space-branch; retire-preserves. *Deferred:*
|
||||
ephemeral "happenings" as a first-class mode. *Open:* GC policy for retired-space derived tier.
|
||||
|
||||
## T8 — Transclusion & projection depth
|
||||
|
||||
**Decision.** Transclusion is **one reference-not-copy primitive** over the addressable union
|
||||
(blueprint §8.4), at three depths:
|
||||
|
||||
| Level | UC | Behaviour |
|-------|-----|-----------|
| Whole-page projection | UC-03 | lazy full page from a remote shard (replication-projection) |
| Block/span transclusion | UC-32 | inline embed by span address, with origin + freshness |
| Link reference | UC-08 | pointer only (ADR-06) |
|
||||
|
||||
Every transcluded artifact carries provenance + **staleness** (I-4, §8.8); live-transclusion
|
||||
fragility (link rot / remote down) degrades to last-known + an `unavailable`/`stale` marker
|
||||
(blueprint O-11). Span-level authz under transclusion = the stricter of source/host (O-10).
|
||||
Reference-not-copy unifies Xanadu transclusion, ZigZag clone, embeds, synced blocks, Trilium
|
||||
note-clone, literate named-chunk. (UC-03, UC-32.)
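The degrade-to-last-known behaviour can be illustrated with a small sketch. Everything here is an assumption for illustration: the names `Transcluded` and `transclude`, the `fetch` callable, and in particular the mapping "cached copy exists gives `stale`, no copy gives `unavailable`", which is one plausible reading of the two markers rather than something the blueprint fixes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Transcluded:
    content: Optional[str]
    origin: str
    staleness_state: str  # "fresh" | "stale" | "unavailable"


def transclude(origin: str, fetch: Callable[[str], str], last_known: dict) -> Transcluded:
    """Fetch a transcluded span; on remote failure fall back to the last-known copy."""
    try:
        content = fetch(origin)
    except OSError:  # covers ConnectionError, TimeoutError, and similar I/O failures
        cached = last_known.get(origin)
        state = "stale" if cached is not None else "unavailable"
        return Transcluded(cached, origin, state)
    last_known[origin] = content  # remember the successful read for later fallback
    return Transcluded(content, origin, "fresh")
```

The host page always gets an artifact back, with its origin and staleness attached, so a dead remote degrades the embed instead of breaking the page.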
|
||||
|
||||
*Decided:* one primitive, three depths, provenance+staleness mandatory. *Deferred:*
snapshot-on-import vs live default (a policy). *Open:* span-authz composition (O-10), addressing (O-6).
|
||||
|
||||
## T9 — Consensus & reconciliation policy catalog
|
||||
|
||||
**Decision.** **Detection is core, resolution is policy** (I-7, blueprint §8.6). Core always
|
||||
detects divergence and represents it as a coexisting (chorus) set; the **resolution preset** is
|
||||
configurable per space / per equivalence set:
|
||||
|
||||
- **chorus-spread** (versions coexist; default) · **designated-canonical** (explicit write
|
||||
target) · **git-merge** (where merge capability allows) · **vote-to-merge / editorial gate**
|
||||
(optional, L6-assisted) · **overlay-only** (no destructive merge on read-only sources).
|
||||
|
||||
(FederationRequirements ADR-03/ADR-05; UC-07, UC-27.)
|
||||
|
||||
*Decided:* the preset catalog; detection-core/resolution-policy. *Deferred:* vote/editorial
|
||||
gate mechanics (L6). *Open:* default preset per persona bundle (blueprint O-8).
|
||||
|
||||
## T10 — Federation-operations capability matrix
|
||||
|
||||
**Decision.** Each federation operation requires a minimum **verified capability profile**
|
||||
(TSD §A T11/§6.6); below it, the op **degrades** along the named interaction axes (blueprint
|
||||
§6.5) rather than failing. The matrix (consuming the T11 vocabulary):
|
||||
|
||||
| Federation op | Min capabilities | Degradation when absent |
|---------------|------------------|--------------------------|
| **read / project** | `read` | (floor — every shard) |
| **transclude span** | `read` + addressing≥span | whole-page projection only |
| **overlay** | `read` | always available (overlay is local) |
| **write-through** | `write` (+ `merge` for concurrent) | → overlay/patch target |
| **diff / 3-way merge** | `diff`,`merge` | → keep-both / chorus |
| **notify-driven freshness** | `notify` | → validator-poll → TTL (§8.8) |
| **native search delegate** | native-query≥index | → derived index over projection |
| **publish / mirror** | `publish` | → read-only / static projection |
| **fork+journal federation** | `read` + history (any) | → projection-only participant |
|
||||
|
||||
Every backend, down to the Oddmuse flat-file floor, is usable as **read-only / cache /
|
||||
projection / backup / patch target** (I-8). This matrix and the T11 spectra are kept in sync
|
||||
(it consumes that vocabulary). (UC-02–UC-07.)
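A lookup over part of the matrix shows the "degrade, don't fail" rule in code. This is a hedged sketch: only the rows whose requirement is a plain set of verbs are modelled (spectrum positions such as addressing≥span are omitted), and `MATRIX` and `plan` are hypothetical names.

```python
# Rows copied from the matrix above: op -> (required verbs, degradation when absent).
MATRIX = {
    "read / project": ({"read"}, None),  # the floor: no degradation exists below it
    "write-through": ({"write"}, "overlay / patch target"),
    "diff / 3-way merge": ({"diff", "merge"}, "keep-both / chorus"),
    "notify-driven freshness": ({"notify"}, "validator-poll, then TTL"),
    "publish / mirror": ({"publish"}, "read-only / static projection"),
}


def plan(op: str, profile: set) -> str:
    """Return the op itself if the verified profile covers it, else its degradation."""
    required, fallback = MATRIX[op]
    if required <= profile:  # subset test: every required verb is in the profile
        return op
    if fallback is None:
        raise ValueError(f"{op!r} is the capability floor and cannot degrade further")
    return fallback
```

For a read-only shard, `plan("write-through", {"read"})` yields the overlay/patch path rather than an error, which matches the degradation column.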
|
||||
|
||||
*Decided:* the op→capability matrix + degradation paths. *Deferred:* exhaustive per-op test
|
||||
vectors → the conformance suite (§6.6). *Open:* axis-interaction completeness (blueprint O-5).
|
||||
|
||||
---
|
||||
|
||||
## T17 — Federation-model taxonomy (selectable / composable)
|
||||
|
||||
**Decision.** Federation is **plural and composable**, not one mechanism (blueprint §8.3). A
|
||||
space selects a model (composing per shard); the default is **fork+journal over Git** (the home
|
||||
case):
|
||||
|
||||
| Model | Anchor | Coordination shape | Capability floor | UCs |
|-------|--------|--------------------|------------------|-----|
| **Fork + journal** *(default)* | Federated Wiki | copy-with-provenance + per-page action journal (story = replay) | `read` + journal | UC-26, UC-71, UC-72 |
| **VCS-replication + ping** | ikiwiki | git clone/pull/push + change-ping | git-IS-store | UC-31, UC-33, UC-79 |
| **Query-time graph-join** | Wikibase `SERVICE` | join remote graphs at query, no copy | native-query≥graph | UC-74 |
| **Feed aggregation** | RSS/Atom | inbound feed → pages | `read` | UC-03 |
| **Activity streams** | ActivityPub | Create/Update events (notify or content) | `notify` | UC-31 |
| **Engine-mirror** | Wiki.js DB↔Git | engine mirrors its store to git | `publish`/mirror | UC-68, UC-69 |
|
||||
|
||||
The coordination journal (T4) is the fork+journal model's home; **git IS the journal** in
|
||||
VCS-replication and engine-mirror (forge wikis make git the store *and* journal, resolving the
|
||||
engine-mirror write-race). Models compose: e.g. fork+journal locally, query-join for a
|
||||
typed-graph shard, feed-aggregation for an external source. (Mechanism over policy, I-7.)
|
||||
|
||||
*Decided:* the six models, selectable per space + composable per shard, default fork+journal.
|
||||
*Deferred:* activity-streams content-bearing mode. *Open:* multi-model interaction edge cases
|
||||
(when two models touch one shard).
|
||||
|
||||
---
|
||||
|
||||
## Consolidated decisions / deferred / open
|
||||
|
||||
- **Decided (architecture-settling):** star orchestrator (T1); overlay-default remix (T2);
|
||||
identity/placement/equivalence + chorus (T3); event-sourced journal + append authority (T4);
|
||||
server-primary incremental union (T5); notify-driven freshness (T6); space lifecycle (T7);
|
||||
reference-not-copy transclusion (T8); detection-core/resolution-policy preset catalog (T9);
|
||||
op→capability matrix (T10); selectable/composable federation models (T17).
|
||||
- **Deferred (to impl / L6 / persona bundles):** reference UI; fork-attribution format; client
|
||||
SDK; vote/editorial mechanics; ephemeral spaces; ActivityPub content-bearing; default-preset
|
||||
bundles (O-8).
|
||||
- **Open (tracked in blueprint §12):** O-1 equivalence blocking, O-4 union digest, O-5 axis
|
||||
interactions, O-6 addressing, O-8 presets, O-10 span-authz, O-11 unavailability, O-12 log
|
||||
throughput.
|
||||
|
||||
## Acceptance (SHARD-WP-0002 federation half)
|
||||
|
||||
T1–T10 + T17 each have a decision record with a Decided/Deferred/Open footer; all honour INTENT;
|
||||
UC-26–UC-33, UC-56, UC-71–UC-72, UC-74, UC-79 trace here; the adapter-contract companion is
|
||||
`spec/TechnicalSpecificationDocument.md` §A (T11–T16, T18). Conflicts with SHARD-WP-0001 are
|
||||
none (FederationRequirements ADRs are consumed, not duplicated).
|
||||
214
spec/FederationRequirements.md
Normal file
@@ -0,0 +1,214 @@
|
||||
# FederationRequirements — yawex-derived design notes
|
||||
|
||||
Status: **draft for review** · Date: 2026-06-15 · Deliverable of **SHARD-WP-0001**
|
||||
|
||||
Concrete, ADR-ready design decisions for the union/federation layer, distilled from the yawex
|
||||
prior-art review (`research/260608-yawex-prior-art/findings.md`) and made **consistent with**
|
||||
`spec/CoreArchitectureBlueprint.md` (the whole-system architecture) and `INTENT.md`. yawex is
|
||||
*inspiration and a case checklist*, never an interface to inherit (decision 2026-06-08).
|
||||
|
||||
These are **requirements** (what the union must do); `spec/FederationArchitecture.md`
|
||||
(SHARD-WP-0002) is the architecture that realises them. UC references are to
|
||||
`spec/UseCaseCatalog.md`.
|
||||
|
||||
**Resolved precondition.** The yawex "minimal access model in core" thread is **settled**:
|
||||
authorization-in-core / authentication-delegated, the L0–L4 ladder — INTENT amended,
|
||||
`ArchitectureBlueprint.md` ratified. Auth is therefore *not* re-litigated here; these ADRs
|
||||
assume it (the resolver and overlay paths run behind the L5 PEP/PDP).
|
||||
|
||||
---
|
||||
|
||||
## ADR-01 — Cross-shard page resolution
|
||||
|
||||
**Status:** accepted · **CoreArchitectureBlueprint:** §8.4 (union), §7.2 (identity)
|
||||
|
||||
**Context.** yawex's most-tested component was `PageLookUp`, a context-relative resolver with
|
||||
ordered states `LOCAL, CLIMB, DROP, GLOBAL, REMOTE, SWITCH, JUMP, VIRTUAL, FAILED`. Decision
|
||||
(2026-06-08): **inspiration only** — use the states as a *checklist of cases*, design the
|
||||
federation resolver fresh.
|
||||
|
||||
**Decision.** Resolution is a pure function over the derived union:
|
||||
`resolve(name, from_context, policy) → ResolutionResult`, evaluated in a defined order and
|
||||
keyed on **page identity** (the stable handle), not path (I-9):
|
||||
|
||||
1. **Same-namespace, same-shard** exact identity match — *(yawex LOCAL)*.
2. **Namespace walk** — climb/descend ancestor namespaces of `from_context` — *(CLIMB/DROP)*.
3. **Union lookup** — match identity / alias-table entry across all attached shards — *(GLOBAL)*.
4. **Equivalence set** — if several shards hold an equivalent page, return the **chorus set**;
   the *canonical-source* policy preset (chorus / designated-canonical) decides presentation —
   never silently pick one (union without erasure, I-4). *(implicit in yawex; explicit here.)*
5. **Projection / virtual** — a page whose content lives in a remote shard (lazy replication,
   *REMOTE*) or is computed/query-defined (derivation, *VIRTUAL*).
6. **Explicit address** — a shard- or space-qualified reference resolves directly — *(SWITCH/JUMP)*.
7. **Not found** → **red-link**: a resolvable *target* with no page yet, createable (UC-23) —
   *(FAILED)*.
|
||||
|
||||
`ResolutionResult` carries the resolved identity (or red-link), the placement(s), the
|
||||
provenance envelope (ADR-04), and — on ambiguity — the full chorus set. Resolution is a
|
||||
**derived-tier read** (union graph + equivalence index, §8.4); it is incrementally maintained and
|
||||
respects the L5 authz filter (a principal never resolves to content it can't see).
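The ordered evaluation can be sketched as a pure function over an in-memory stand-in for the union. This is a simplified illustration, not the resolver: it models steps 1, 2 (climb only), 3, 4, and 7; it leaves out projection/virtual pages, explicit addresses, the policy argument, and the authz filter; and it treats "same name in several shards" as the equivalence set. The placement tuple layout and all names other than `resolve` and `ResolutionResult` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

# A placement is (shard, namespace, name, identity); namespaces look like "docs/api".


@dataclass
class ResolutionResult:
    step: str                       # which rule produced the result
    identity: Optional[str] = None  # None for a chorus set or a red-link
    chorus: list = field(default_factory=list)


def ancestors(namespace: str):
    """Yield the parents of "a/b/c" in climbing order: "a/b", "a", ""."""
    parts = namespace.split("/") if namespace else []
    for end in range(len(parts) - 1, -1, -1):
        yield "/".join(parts[:end])


def resolve(name: str, from_context: tuple, placements: list) -> ResolutionResult:
    shard, namespace = from_context

    def find(ns: str) -> list:
        return [p[3] for p in placements if p[:3] == (shard, ns, name)]

    hits = find(namespace)                        # step 1: same namespace, same shard
    if hits:
        return ResolutionResult("local", hits[0])
    for parent in ancestors(namespace):           # step 2: namespace walk (climb)
        hits = find(parent)
        if hits:
            return ResolutionResult("namespace-walk", hits[0])
    identities = sorted({p[3] for p in placements if p[2] == name})
    if len(identities) == 1:                      # step 3: union lookup
        return ResolutionResult("union", identities[0])
    if identities:                                # step 4: never silently pick one
        return ResolutionResult("chorus", None, identities)
    return ResolutionResult("red-link")           # step 7: createable target
```

The function reads only its arguments, so the same inputs always give the same result, which is the determinism the Consequences paragraph relies on.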
|
||||
|
||||
**Consequences.** Deterministic, policy-driven, no privileged shard; every yawex state has a
|
||||
home without inheriting its structure. **Open:** cross-space `JUMP` addressing syntax ties to
|
||||
the portable-address scheme (CoreArchitectureBlueprint O-6).
|
||||
|
||||
---
|
||||
|
||||
## ADR-02 — Namespace / path model & shard roles
|
||||
|
||||
**Status:** accepted · **CoreArchitectureBlueprint:** §7.2 (identity≠placement), §4 (shard mode)
|
||||
|
||||
**Context.** yawex modelled **topics as directories** (a topic's gateway page shares the dir
|
||||
name) with relative (`../`) / absolute (`/`) paths and normalization, and page classes
|
||||
`local / global / virtual`.
|
||||
|
||||
**Decision.**
|
||||
- The union exposes a **namespace tree**; a **path is a placement coordinate, not an identity**
|
||||
(I-9). One identity may have **N placements** (paths/shards) → a DAG, no single canonical path.
|
||||
- **Path grammar:** relative (`../`, `./`) resolved against `from_context`; absolute (`/`)
|
||||
against the space root; **normalization** (collapse `.`/`..`; case & separator handling) is a
|
||||
per-space **policy** knob (§10) — defaulting to case-preserving, `/`-separated.
|
||||
- **yawex page classes → shard roles / modes:** `local → canonical`, `global → cross-cutting`
|
||||
(a shard whose pages augment/overlay across namespaces), `virtual → projected/computed`. These
|
||||
are the INTENT shard modes (read-only · write-through · mirrored · projected · cached ·
|
||||
canonical), selected by policy, constrained by the capability profile.
|
||||
- Cross-shard name collisions are resolved by ADR-01's order + equivalence, not by a global
|
||||
flat namespace.
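The path grammar can be illustrated with a small normalizer. This is a sketch of the stated default only (case-preserving, `/`-separated); the function name `normalize`, the choice to treat `from_context` as a namespace rather than a page, and the rule that `..` at the space root stays at the root are assumptions, since the ADR leaves normalization to per-space policy.

```python
def normalize(path: str, from_context: str = "/") -> str:
    """Resolve `path` against `from_context`, collapsing "." and ".." segments."""
    # Absolute paths start from the space root; relative ones from the context.
    if path.startswith("/"):
        stack = []
    else:
        stack = [segment for segment in from_context.split("/") if segment]
    for segment in path.split("/"):
        if segment in ("", "."):
            continue                 # empty and "." segments change nothing
        if segment == "..":
            if stack:
                stack.pop()          # climbing above the root is ignored
        else:
            stack.append(segment)    # case-preserving: no folding of segment text
    return "/" + "/".join(stack)
```

The result is a placement coordinate only; mapping it to a page identity remains the job of the ADR-01 resolver.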
|
||||
|
||||
**Consequences.** The namespace is a navigable view over placements (a dimension of the union),
|
||||
not a storage layout. **Open:** per-space normalization policy defaults (case-sensitivity,
|
||||
unicode) — a policy preset, bundled per persona (O-8).
|
||||
|
||||
---
|
||||
|
||||
## ADR-03 — Union-level derived views: core vs adapter
|
||||
|
||||
**Status:** accepted · **CoreArchitectureBlueprint:** §8.4 (derived query index), §8.7
|
||||
(incremental) · **resolves findings open-Q #3**
|
||||
|
||||
**Context.** yawex shipped BackLinks, RecentChanges, AllPages, SiteMap, full-text Search.
|
||||
|
||||
**Decision.** Each is a **derived-tier** view (rebuildable, incrementally maintained); classify
|
||||
by where it is computed:
|
||||
|
||||
| View | Tier | Rationale |
|------|------|-----------|
| **BackLinks** | **core** | the link-graph over the union — squarely shard-wiki's concern (UC-18); the strongest core view |
| **RecentChanges** | **core** | merge the coordination journal (§8.1) with shard change events (notify/poll, §8.8) across the union (UC-17) |
| **AllPages / SiteMap** | **core** | an enumeration/projection of the union graph; cheap |
| **Search (full-text)** | **hybrid** | **delegate** to a shard's native search where the native-query axis allows; **else** build a derived index over projections (the Logseq DataScript-over-files pattern, UC-19/UC-63) |
|
||||
|
||||
All are computed in the derived tier and carry provenance; **presentation is L6 (UI)**, never
|
||||
hard-coded in core (mechanism over policy, I-7).
|
||||
|
||||
**Consequences.** BackLinks/RecentChanges/AllPages/SiteMap are core deliverables; Search is a
|
||||
capability-gated delegate-or-derive. **Open:** cross-shard search ranking/relevance is a policy
|
||||
concern, not core mechanism.
|
||||
|
||||
---
|
||||
|
||||
## ADR-04 — Provenance & freshness model (concrete fields)
|
||||
|
||||
**Status:** accepted · **CoreArchitectureBlueprint:** §7.3 (layered provenance), §8.8 (freshness)
|
||||
|
||||
**Context.** yawex `Page::info` exposed modtime (and TODO'd last-editor / hits / edits). INTENT
|
||||
mandates explicit provenance & freshness (I-4).
|
||||
|
||||
**Decision.** Concretise the **provenance envelope** (layered: page-level + span-level deltas):
|
||||
|
||||
```
ProvenanceEnvelope (page-level):
  source_shard         # which shard this came from
  source_rev?          # shard-native revision id, if the shard exposes one
  observed_at          # when shard-wiki last read it
  liveness             # static | captured-snapshot | live-over-files | view-time | irreducibly-live
  staleness_state      # live | fresh | stale | unavailable
  authz_context        # the L5 context under which it was read (no-leak, §9)
  overlay_state        # none | draft | patch-pending | applied
  divergence[]         # equivalence-set peers that differ (the chorus, ADR-01 step 4)
  derivation_lineage?  # for derived/derivation-projection content (source → this view)
Span-level: only the fields that differ from the page envelope (effective = page ⊕ delta).
```
|
||||
|
||||
`freshness = (observed_at, source_rev?, staleness_state)`; `unavailable` is the dead-shard
|
||||
state (feeds ADR-03 RecentChanges and the union-under-unavailability open item O-11). This
|
||||
envelope is the data behind union-without-erasure and the input to conflict display.
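The layering rule `effective = page ⊕ delta` can be shown with plain dictionaries. This sketch assumes ⊕ means "span fields override page fields", which is how the envelope definition reads but is not spelled out formally; `effective_envelope` is a hypothetical helper name.

```python
def effective_envelope(page: dict, span_delta: dict) -> dict:
    """Overlay a span's delta on the page envelope; the inputs are left unmodified."""
    return {**page, **span_delta}  # later keys win, so span fields override page fields


page = {
    "source_shard": "notes-git",
    "observed_at": "2026-06-15T10:00:00Z",
    "staleness_state": "fresh",
    "overlay_state": "none",
}
# A span transcluded from a shard that is currently slow to answer stores only what differs.
span_delta = {"source_shard": "remote-notion", "staleness_state": "stale"}
effective = effective_envelope(page, span_delta)
```

Storing only the differing fields keeps per-span provenance cheap: a span with an empty delta costs nothing beyond the page envelope it inherits.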
|
||||
|
||||
**Consequences.** Provenance is queryable, layered (cheap per-span), and drives views/conflict
|
||||
UI. **Open / DROP:** hits/edits *analytics* are an L6/analytics concern, out of core (dropped
|
||||
from the yawex feature set).
|
||||
|
||||
---
|
||||
|
||||
## ADR-05 — Overlay / lightweight-patch model
|
||||
|
||||
**Status:** accepted · **CoreArchitectureBlueprint:** §8.2 (overlay engine), §8.1 (decision
|
||||
log), §8.6 (apply-under-drift)
|
||||
|
||||
**Context.** yawex had an append + threaded-comment workflow (edit a page without a full
|
||||
rewrite). INTENT mandates overlay-before-mutation (I-5).
|
||||
|
||||
**Decision.** The overlay lifecycle for any shard below write-through capability:
|
||||
|
||||
1. **Draft** — an edit becomes a draft recorded as a **coordination-canonical event** in the
   space's decision log (§8.1); the unapplied overlay is the **local truth**, fully attributed
   (`overlay_state`, ADR-04).
2. **Patch / MR** — rendered in the shard's **native syntax (lossless)** or **Markdown
   (lossy-with-fidelity-report)** per the translation axis.
3. **Apply** — only on explicit intent, only where profile + policy permit, with
   **apply-under-drift** semantics (§8.6: fast-forward / three-way / refuse-and-re-present).
4. **Append / comment** = a **constrained overlay subtype** (purely additive, low-conflict) —
   the direct generalisation of yawex's workflow; safe even on the most limited shards.
|
||||
|
||||
This makes read-only, rate-limited, and lossy shards **editable** without silent remote
|
||||
mutation (graceful degradation, I-8; no silent mutation).
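The apply step's three outcomes can be sketched as a decision function. The selection criteria below are an assumption made for illustration (the blueprint's §8.6 is the authority): fast-forward when the remote has not moved since the draft's base revision, three-way when it has moved and the shard can merge, refuse otherwise. `apply_overlay` and its parameters are hypothetical names.

```python
def apply_overlay(explicit_intent: bool, permitted: bool,
                  base_rev: str, remote_rev: str, can_merge: bool) -> str:
    """Pick an apply strategy for a pending overlay; never mutate without intent."""
    if not explicit_intent:
        return "keep-as-overlay"        # no silent remote mutation
    if not permitted:
        return "keep-as-overlay"        # capability profile or policy forbids the write
    if base_rev == remote_rev:
        return "fast-forward"           # remote unchanged since the draft was made
    if can_merge:
        return "three-way"              # remote drifted, shard supports merge
    return "refuse-and-re-present"      # remote drifted, show the conflict to the author
```

In every branch that does not write, the overlay itself survives in the decision log, so the edit is never lost even when it cannot be applied.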
|
||||
|
||||
**Consequences.** One overlay mechanism spans drafts, patches, MRs, and comments; storage is the
|
||||
decision log (no separate store). **Open:** whether threaded-comment threading is a first-class
|
||||
overlay subtype or a generic structured-append — defer to implementation.
|
||||
|
||||
---
|
||||
|
||||
## ADR-06 — Markdown link semantics (wikilink + red-link)
|
||||
|
||||
**Status:** accepted · **CoreArchitectureBlueprint:** §7 (page model), ADR-01 (resolution) ·
|
||||
**resolves findings open-Q #2**
|
||||
|
||||
**Context.** yawex had CamelCase WikiLinks, `[[free links]]`, red-`?` links for nonexistent
|
||||
pages, `::` labels. INTENT mandates Markdown-first; TRANSFORM the *semantics*, drop the bespoke
|
||||
syntax.
|
||||
|
||||
**Decision.**
|
||||
- Adopt a **CommonMark wikilink extension**: `[[Target]]` / `[[Target|label]]`, resolved via
|
||||
ADR-01 (by identity/name across the union).
|
||||
- **Red-link** = a wikilink whose target resolves to FAILED (ADR-01 step 7): a valid,
|
||||
*createable* target with no page yet (the soft-create affordance, UC-23).
|
||||
- **CamelCase auto-linking is OFF by default** (a legacy affordance); opt-in **per space** for
|
||||
migrating CamelCase wikis (UseModWiki/c2 lineage, UC-25).
|
||||
- Links are stored **as text** (git-diffable; structure/links federate iff in-text).
|
||||
- **Boundary (resolves open-Q #2):** the link **model + resolution is core**; the **visual
|
||||
rendering** of wikilinks/red-links is a **reference-UI (L6)** concern. Core decides *what a
|
||||
link resolves to and whether it's a red-link*; the UI decides how it looks.
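Extracting `[[Target]]` / `[[Target|label]]` links and flagging red-links can be sketched with a regular expression. This is an illustration, not a CommonMark extension implementation: it ignores code spans and escapes, and it checks targets against a plain set of known names instead of calling the ADR-01 resolver. `WIKILINK` and `extract_wikilinks` are hypothetical names.

```python
import re

# [[Target]] or [[Target|label]]; neither part may contain square brackets.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def extract_wikilinks(text: str, existing: set) -> list:
    """Return one record per wikilink, marking targets with no page as red-links."""
    links = []
    for match in WIKILINK.finditer(text):
        target = match.group(1).strip()
        label = (match.group(2) or target).strip()  # the label defaults to the target
        links.append({
            "target": target,
            "label": label,
            "red_link": target not in existing,     # core decides this; the UI styles it
        })
    return links
```

The function returns data only, which mirrors the boundary above: whether a link is red is a core fact, while how it is drawn is left to the reference UI.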
|
||||
|
||||
**Consequences.** Markdown-first, backend-neutral, resolution-unified with ADR-01. **Open:**
|
||||
cross-shard wikilink target disambiguation syntax ties to the portable-address scheme (O-6).
|
||||
|
||||
---
|
||||
|
||||
## Coverage & open questions
|
||||
|
||||
- **Findings open-Q #1** (per-page vs per-shard ACL) — answered by the settled access model:
|
||||
per-shard / per-namespace default, per-page ACL opt-in at L4 (`ArchitectureBlueprint.md`).
|
||||
- **Findings open-Q #2** (wikilink core vs UI) — ADR-06: model/resolution core, presentation UI.
|
||||
- **Findings open-Q #3** (derived views core vs adapter) — ADR-03.
|
||||
- Carried to CoreArchitectureBlueprint §12: cross-space addressing syntax (O-6), namespace
|
||||
normalization presets (O-8), cross-shard search ranking (policy), union-under-unavailability
|
||||
(O-11).
|
||||
|
||||
## Acceptance (SHARD-WP-0001)
|
||||
|
||||
Each task T1–T6 has an ADR (T1→ADR-01, T2→ADR-02, T3→ADR-03, T4→ADR-04, T5→ADR-05, T6→ADR-06);
|
||||
all honour INTENT (mechanism over policy, union without erasure, overlay before mutation,
|
||||
capability-aware adapters) and are consistent with `CoreArchitectureBlueprint.md`. The
|
||||
access-model thread is ratified (not deferred). The next implementation workplan (domain model /
|
||||
adapter contract) can proceed without unresolved yawex-derived design gaps.
|
||||
@@ -6,10 +6,15 @@ Background on document types: InfoTechPrimers on coulomb.social.
|
||||
|
||||
| File | Status | Role |
|
||||
|------|--------|------|
|
||||
| `CoreArchitectureBlueprint.md` | draft for review | **Whole-system architecture** — layers, abstractions, load-bearing decisions (synthesised from all research) |
|
||||
| `FederationArchitecture.md` | draft for review | federation design — *what the union does*: T1–T10 decision records + the federation-model taxonomy (SHARD-WP-0002) |
|
||||
| `WikiEngineCoreArchitecture.md` | draft for review | the native **headless, API-first wiki engine** — small page-store kernel + typed-extension framework, as a canonical-mode shard backend (SHARD-WP-0013) |
|
||||
| `adr/` | living | Architecture Decision Records (ADR-0001: engine activation via feature-control) |
|
||||
| `FederationRequirements.md` | draft for review | yawex-derived union/federation design notes — resolution, namespace, derived views, provenance, overlay, links (ADR-01…06; SHARD-WP-0001) |
|
||||
| `ProductRequirementsDocument.md` | draft scaffold | What the product must deliver |
|
||||
| `TechnicalSpecificationDocument.md` | draft scaffold | How the system is built |
|
||||
| `UseCaseCatalog.md` | draft | 25 use cases promoted from c2 + yawex research |
|
||||
| `ArchitectureBlueprint.md` | draft | Access, history, and identity architecture |
|
||||
| `TechnicalSpecificationDocument.md` | draft + §A | How the system is built; **§A = the normative shard adapter contract** (T11–T16, T18; SHARD-WP-0002) |
|
||||
| `UseCaseCatalog.md` | draft | 84 use cases promoted from c2 + yawex + ~23-system research |
|
||||
| `ArchitectureBlueprint.md` | draft | Access, history, and identity sub-blueprint (the L0–L4 authorization ladder; referenced by CoreArchitectureBlueprint §9) |
|
||||
|
||||
Promote material from `research/` and reviewed items from `demand/` into spec
|
||||
before treating it as implementation authority.
|
||||
@@ -33,7 +33,134 @@ per information space.
|
||||
|
||||
## 4. Architecture References
|
||||
|
||||
- `spec/ArchitectureBlueprint.md` — access, history, identity delegation
|
||||
- `spec/CoreArchitectureBlueprint.md` — whole-system architecture (layers, the dual narrow
|
||||
waist, the 15 capability spectra, projection, consistency); the normative source for §A.
|
||||
- `spec/FederationArchitecture.md` — federation design (*what the union does*); §A is its
|
||||
companion (*what a backend must expose*).
|
||||
- `spec/FederationRequirements.md` — yawex-derived ADRs (resolution, page model, overlay).
|
||||
- `spec/ArchitectureBlueprint.md` — access, history, identity delegation (L5).
|
||||
|
||||
---
|
||||
|
||||
# A. Shard Adapter Contract (normative)
|
||||
|
||||
Deliverable of **SHARD-WP-0002** (T11–T16, T18): the versioned interface a backend implements
|
||||
to participate as a shard, and the *normative* rules core relies on. It is the **bottom narrow
|
||||
waist** (blueprint §6); the page model is the top waist (§7). This section is normative where it
|
||||
says **MUST/SHOULD**; design rationale lives in the blueprint (cited, not restated).
|
||||
|
||||
## A.1 (T11) Capability model & versioned interface
|
||||
|
||||
- The contract is a **versioned interface** (`Foswiki::Store`/`Foswiki::Meta` is the proof a
|
||||
swappable-backend-behind-a-stable-interface works). A binding declares the contract version
|
||||
it implements; core checks compatibility at registration.
|
||||
- **Operation verbs:** `read, write, diff, merge, lock, version, publish, notify,
|
||||
transclude-source, translate-syntax, structured-payload, derive-projection, execute`. The
|
||||
last two are **gated, OFF by default** (A.6/T18). Verb support is part of the profile and MUST
|
||||
reconcile with the T10 federation-ops matrix.
|
||||
- **Capability profile = a position on each capability spectrum**, not a boolean checklist. The
|
||||
full set is the **fifteen spectra** (blueprint §6.1); operationally they reduce to a **small
|
||||
orthogonal core** (substrate, write-granularity, content-opacity, operational-envelope,
|
||||
access-grant, computational/liveness) with the rest **implied** via published rules that
|
||||
**forbid impossible profiles** (blueprint §6.5). Degradation reads only the **named
|
||||
axis-interaction subset** (blueprint §6.5 table) — that table *is* the degradation contract.
|
||||
- A profile MUST express **absence** cleanly (the Oddmuse floor); core never guesses a missing
|
||||
capability.
|
||||
|
||||
## A.2 (T11/§6.6) Conformance — profiles are verified, not asserted
|
||||
|
||||
- The contract ships a **versioned conformance suite**. A binding is **admissible only if it
|
||||
passes**: the suite exercises each declared verb and spectrum position and checks observed
|
||||
behaviour matches the claim (a `write` round-trips, `notify` actually fires, an opaque shard
|
||||
refuses plaintext query, implied-position rules hold). A lying/buggy profile is **rejected at
|
||||
registration**, not run in production (blueprint §6.6).
|
||||
- Conformance makes capability-as-data (I-3) and the §6.5 degradation contract **sound**.
|
||||
Mismatch is reported as a capability-by-capability diff; an adapter may register
|
||||
*degraded-but-honest* (drop the unsupported claim).
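The verified-not-asserted rule reduces to comparing what a binding declares with what the suite observes. The sketch below assumes both are available as sets of verb names; the real suite also checks spectrum positions and implied-position rules, which are not modelled here. `conformance_diff` and `degraded_but_honest` are hypothetical names.

```python
def conformance_diff(declared: set, observed: set) -> dict:
    """Report each declared verb that the suite could not confirm."""
    return {verb: "claimed but not observed" for verb in sorted(declared - observed)}


def degraded_but_honest(declared: set, observed: set) -> set:
    """Drop unsupported claims so the binding can register with a truthful profile."""
    return declared & observed


declared = {"read", "write", "notify"}
observed = {"read", "write"}           # the suite saw no notification fire
report = conformance_diff(declared, observed)
honest_profile = degraded_but_honest(declared, observed)
```

An empty report means the profile is admissible as declared; a non-empty one gives the adapter author a capability-by-capability list to either fix or drop.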
|
||||
|
||||
## A.3 (T14) Adapter binding — attachment-mode taxonomy
|
||||
|
||||
A backend MAY offer several modes; attach mode is a **per-binding, capability-gated choice**
|
||||
with one declared authoritative (blueprint §6.3):
|
||||
|
||||
- **file-store** (native vault/folder *or* an interchange/sync mirror) · **git-IS-store** (forge
|
||||
wikis & ikiwiki — git is store *and* journal; the home case) · **in-engine-hosted** (XWiki
|
||||
component, Obsidian/Logseq/Roam plugin, Trilium script) · **local-REST** (Joplin Data API,
|
||||
Trilium ETAPI) · **external-API-only** (Notion) · **direct-DB** (MojoMojo schema→model) ·
|
||||
**CRDT-replica** (Anytype/AFFiNE/AppFlowy) · **P2P / no-central-endpoint**.
|
||||
- **Backend-swap tolerance:** identity/provenance MUST survive a substrate change (bind to
|
||||
capabilities, not "it's files"). **Boundary:** an **image / live-memory blob is never an
|
||||
attach target** (I-12) — participation is export→files only.
|
||||
- (UC-38, UC-40, UC-43, UC-50, UC-53, UC-57, UC-60, UC-62, UC-64, UC-65, UC-76, UC-79, UC-81.)
|
||||
|
||||
## A.4 (T12) Page model — structured / computational payload
|
||||
|
||||
The backend-neutral, Markdown-first page model MUST carry, without lossy flattening, every
|
||||
shape in blueprint §7.1: prose; typed/computed records (incl. effective-vs-own computed
|
||||
metadata); typed-graph statements (Wikibase); inline-embedded objects (Quip/Notion);
|
||||
non-Markdown assets (typed asset / opaque blob / content-type registry); and the **four
|
||||
computational shapes** (one-source-many-projections UC-83; notebook-with-embedded-output UC-84;
|
||||
program-as-page; live/temporal). All reduce to `(content|source, structure, provenance
|
||||
envelope, [derivation rule])`. **Identity is a stable handle; placement is separate; equivalence
|
||||
is fingerprint-based** (blueprint §7.2, FederationRequirements ADR-01/02, I-9). The **provenance
|
||||
envelope is layered** (page + span deltas; effective = page ⊕ delta, §7.3). (UC-34, UC-39,
|
||||
UC-55, UC-58, UC-54, UC-44, UC-66, UC-67, UC-73, UC-83, UC-84.)
|
||||
|
||||
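The layered envelope above can be illustrated with a small sketch. It assumes, purely for illustration, that an envelope is a flat mapping of provenance fields and that ⊕ means "span-level values override page-level values". The names `Envelope` and `effective_envelope` are hypothetical and do not come from the blueprint.

```python
from types import MappingProxyType
from typing import Mapping

Envelope = Mapping[str, str]  # hypothetical: a flat set of provenance fields


def effective_envelope(page: Envelope, delta: Envelope) -> Envelope:
    """Return page ⊕ delta: span-level fields override page-level ones."""
    merged = dict(page)   # copy, so the page envelope itself is never mutated
    merged.update(delta)  # the span delta wins wherever both define a field
    return MappingProxyType(merged)  # read-only view of the effective envelope


page_env = {"shard": "home", "author": "alice", "rev": "a1b2c3"}
span_delta = {"author": "bob"}  # only this span has a different author
effective = effective_envelope(page_env, span_delta)
```

Storing only the delta per span keeps the common case (a span that inherits everything from its page) free of duplicated metadata.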
## A.5 (T13) History portability — adopt / supplement / import
|
||||
|
||||
Per the history axis: **adopt** git-native history as-is; **supplement** non-portable internal
|
||||
history (DB / Notion / Joplin / Trilium revisions, CRDT-log) — the journal begins-now / mirrors
|
||||
/ snapshots; **import** open file history (RCS, PlainFile, MojoMojo DB-version-rows) backfilled
|
||||
preserving author/timestamp. **Partial/truncated history MUST be reported honestly** ("history
|
||||
begins at", UC-24), never implied complete. Embedded-output documents (notebooks) use
|
||||
paired-text (Jupytext) / cell-aware merge (nbdime). The space's own coordination history is the
|
||||
event-sourced decision log (blueprint §8.1). (UC-36, UC-41, UC-24.)
|
||||
|
||||
## A.6 (T15) Syntax translation & content fidelity
|
||||
|
||||
Translation is a spectrum: **native → lossless → lossy-with-fidelity-report**. Read native
|
||||
markup (TML, XWiki syntax, HTML) into the page model; accept Markdown overlays back via
|
||||
**lossless bidirectional translation** where possible (Foswiki WysiwygPlugin is the proof). For
|
||||
non-round-tripping models (Notion blocks, Trilium HTML, CRDT, typed-graph, notebook JSON/MIME):
|
||||
translate lossily but **make the fidelity loss visible** — a per-shard/per-page report of what
|
||||
projects cleanly vs degrades, with non-mappable elements preserved as provenance/sidecar (I-4).
|
||||
Add a **structured-re-evaluable-value** point to the opacity spectrum (Wolfram expression).
|
||||
Where no acceptable translation exists, the shard is a **read-only/projection** participant
|
||||
(I-8), never silently corrupted. (UC-42, UC-59, UC-03, UC-73.)
|
||||
|
||||
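One way to picture the per-page fidelity report is the sketch below. The class `FidelityReport`, the function `build_report`, and the `_PROJECTION` table are hypothetical; the element kinds are invented examples, not a mapping any backend is known to use.

```python
from dataclasses import dataclass, field


@dataclass
class FidelityReport:
    """Hypothetical per-page report of a lossy translation."""

    page: str
    clean: list[str] = field(default_factory=list)         # projected without loss
    degraded: list[str] = field(default_factory=list)      # projected, with loss
    sidecar: dict[str, str] = field(default_factory=dict)  # non-mappable, kept verbatim

    @property
    def lossless(self) -> bool:
        return not self.degraded and not self.sidecar


# Assumed table: how well each native element kind projects to Markdown.
_PROJECTION = {"paragraph": "clean", "heading": "clean", "toggle": "degraded"}


def build_report(page: str, elements: list[tuple[str, str]]) -> FidelityReport:
    """Classify (kind, raw) elements; unknown kinds are preserved, never dropped."""
    report = FidelityReport(page)
    for index, (kind, raw) in enumerate(elements):
        outcome = _PROJECTION.get(kind)
        if outcome == "clean":
            report.clean.append(kind)
        elif outcome == "degraded":
            report.degraded.append(kind)
        else:
            # No acceptable mapping: keep the raw element as a sidecar entry.
            report.sidecar[f"{kind}#{index}"] = raw
    return report
```

A page whose report is not `lossless` would still be readable, but a caller can see exactly which elements degraded before deciding whether to write back.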
## A.7 (T16) Addressing, identity & navigation
|
||||
|
||||
- **Span addressing:** adopt native span IDs where minted (Roam `:block/uid`, Logseq `id::`,
|
||||
Notion/CRDT UUID), shard-scoped so they survive projection and don't collide; else a
|
||||
position address (path+range) or content-fingerprint address. Portable tumbler is the ideal
|
||||
(blueprint O-6).
|
||||
- **Transclusion** = one reference-not-copy primitive over the addressable union (FederationArch
|
||||
T8). **Query/navigation:** delegate to a shard's native query where capable (Datalog, DB
|
||||
query, XWQL, SPARQL), **else build a derived index over the projection** (Logseq pattern);
|
||||
dimensional/query-defined views are derived-tier. (UC-44–48, UC-51, UC-52, UC-54, UC-63,
|
||||
UC-74.)
|
||||
|
||||
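The span-addressing fallback can be sketched as follows. `SpanAddress` and `address_span` are hypothetical names, and the order between the two fallbacks (position first when positions are known to be stable, fingerprint otherwise) is an assumption; the text above only says either may be used.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanAddress:
    """Hypothetical address; kind is 'native', 'position' or 'fingerprint'."""

    shard_id: str
    kind: str
    value: str


def address_span(shard_id: str, path: str, start: int, end: int, text: str,
                 native_id: Optional[str] = None,
                 stable_positions: bool = False) -> SpanAddress:
    """Prefer a native span ID, then a position address, then a fingerprint."""
    if native_id is not None:
        # Scoping by shard keeps equal native IDs from two shards distinct.
        return SpanAddress(shard_id, "native", native_id)
    if stable_positions:
        return SpanAddress(shard_id, "position", f"{path}:{start}-{end}")
    digest = hashlib.sha256(text[start:end].encode("utf-8")).hexdigest()
    return SpanAddress(shard_id, "fingerprint", digest)
```

Because the shard id is part of the address, two shards that happen to mint the same native ID still yield different addresses after projection.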
## A.8 (T18) Computational / executable content capability
|
||||
|
||||
**In scope as a page-model + projection concern; out of scope as an execution platform**
(blueprint §8.5). Core **recognises** computational types, attaches the **canonical source**,
and presents derived forms as **provenance- and liveness-marked projections**.
`derive-projection`/`execute` are **gated capabilities, OFF by default**; they carry a
**trust/sandbox** concern and **degrade to a captured snapshot / static render / recording**.
One snapshot-provenance record (run id, source rev, timestamp, environment "unguaranteed")
serves notebooks, renders, and recordings. No INTENT amendment required. (UC-54, UC-55, UC-83, UC-84.)
|
||||
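As an illustration of the single snapshot-provenance record, here is a minimal sketch. The four fields named in the text (run id, source rev, timestamp, environment) are taken from A.8; the class name, the `kind` labels, and the `live` flag are assumptions added for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SnapshotProvenance:
    """Hypothetical record shared by notebook outputs, renders and recordings."""

    run_id: str
    source_rev: str
    timestamp: datetime
    kind: str  # assumed labels: "notebook", "render", "recording"
    environment: str = "unguaranteed"  # reproducing the environment is never promised
    live: bool = False  # a captured snapshot is never presented as live


def capture(run_id: str, source_rev: str, kind: str) -> SnapshotProvenance:
    """Stamp a captured output with the current UTC time."""
    return SnapshotProvenance(run_id, source_rev, datetime.now(timezone.utc), kind)
```

Using one frozen record type for all three output kinds means a consumer needs only one code path to decide how stale or trustworthy a derived form is.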
## A.9 Conformance & module mapping
|
||||
|
||||
The contract maps to `src/shard_wiki/adapters/` (the bottom waist: `AdapterContract`,
|
||||
`CapabilityProfile`, attachment-mode binding, the conformance suite) consuming `model/`
|
||||
(page model + capability value types) and `provenance/` (blueprint §11). Each concrete adapter
|
||||
ships its declared profile + a conformance pass.
|
||||
|
||||
*Decided:* A.1–A.8 (versioned capability contract, verified conformance, binding taxonomy,
|
||||
page model, history portability, translation, addressing, gated computational content).
|
||||
*Deferred:* per-backend adapter specs (one per backend, later). *Open:* blueprint §12 O-items
|
||||
(addressing O-6, axis interactions O-5, span-authz O-10).
|
||||
|
||||
## 5. Integration Boundaries
|
||||
|
||||
@@ -47,10 +174,13 @@ Package scaffold only (`__version__`, smoke tests). Domain model not yet coded.
|
||||
|
||||
## 7. Use Cases
|
||||
|
||||
`spec/UseCaseCatalog.md` — 25 use cases (UC-01–UC-25) promoted from c2 wiki
|
||||
origins and yawex prior-art research.
|
||||
`spec/UseCaseCatalog.md` — 84 use cases (UC-01–UC-84) from c2/yawex origins,
|
||||
the wiki-engine + modern-tool + computational research, and the syntheses.
|
||||
|
||||
## 8. Next Specification Work
|
||||
|
||||
Outputs from `SHARD-WP-0001` tasks (page resolution, namespaces, derived views,
|
||||
provenance, overlays, link semantics) will be incorporated here as they complete.
|
||||
The design layer is complete: `SHARD-WP-0001` (→ `FederationRequirements.md`),
|
||||
`SHARD-WP-0002` (→ `FederationArchitecture.md` + §A above), and the hardened
|
||||
`CoreArchitectureBlueprint.md`. The next workplan is the **implementation** of the
|
||||
domain model + adapter contract (starting from §A and blueprint §11's module layout),
|
||||
likely with a first spike on the keystone (the event-sourced decision log + derived fold).
|
||||
File diff suppressed because it is too large
269
spec/WikiEngineCoreArchitecture.md
Normal file
@@ -0,0 +1,269 @@
|
||||
# WikiEngineCoreArchitecture
|
||||
|
||||
Status: **draft for review** · Date: 2026-06-15 · Deliverable of **SHARD-WP-0013 T5**
|
||||
|
||||
The architecture of shard-wiki's **native reference wiki-engine**: a **headless, API-first**
|
||||
engine — a **small core** plus a **stringent typed-extension framework** — that addresses the
|
||||
whole use-case catalogue, mediates conflicting requirements into one integrated featureset, and
|
||||
lets each shard **activate only what it needs**. Authoritative as of the ratified INTENT
|
||||
amendment (2026-06-15, decision `84ffdb48`): the engine is **additive** and is shard-wiki's
|
||||
**reference first-party shard backend (a canonical-mode shard)** — not a replacement for other
|
||||
engines, not a UI.
|
||||
|
||||
Relation to other specs (referenced, not restated):
|
||||
- `CoreArchitectureBlueprint.md` — the orchestrator/whole-system architecture. **The engine is
|
||||
one shard behind §A; federation, union, projection, and cross-shard coordination are the
|
||||
orchestrator's job, not the engine's.** That is what keeps the engine small.
|
||||
- `TechnicalSpecificationDocument.md §A` — the shard adapter contract the engine implements.
|
||||
- `FederationRequirements.md` — page resolution, overlay, link semantics (ADRs the engine reuses).
|
||||
- `UseCaseCatalog.md` "Capability structure" layer (T2) — the core-vs-extension map + the
|
||||
conflict-mediation map this document realizes.
|
||||
- reuse surface (`capability.wiki.*`, plus consumed `feature-control` / `authorization`).
|
||||
|
||||
---
|
||||
|
||||
## 1. Thesis: a small page-store kernel; everything else is a typed extension
|
||||
|
||||
> **The engine is a page-store kernel with a typed-extension runtime. Every capability beyond
|
||||
> the c2-minimum is a *typed extension* a shard activates only if it needs it — and a shard's
|
||||
> externally-visible capability profile is *computed from its active extension set*.**
|
||||
|
||||
That single chain — **configuration (which extensions) → capability (what the shard can do) →
|
||||
conformance (verified)** — is the whole design. It mirrors the orchestrator's discipline
|
||||
(`CoreArchitectureBlueprint` §6.5: capability-as-data, verified, no per-backend code) and turns
|
||||
"integrated whole, yet activate only what you need" from a slogan into a mechanism.
|
||||
|
||||
The engine stays small for a structural reason: it is **one shard**, not a federation layer.
|
||||
Union, projection, equivalence, cross-shard overlay-orchestration, and the federation models all
|
||||
live in shard-wiki's orchestrator (the blueprint). The engine implements `ShardAdapter` (§A) and
|
||||
nothing above it. So "wiki engine" here means *a really good single canonical shard with a
|
||||
typed-extension framework and a headless agent-first API* — not a re-implementation of shard-wiki.
|
||||
|
||||
---
|
||||
|
||||
## 2. Engine invariants
|
||||
|
||||
| # | Invariant | Why |
|---|-----------|-----|
| E-1 | **One shard, not a federation layer.** The engine implements `ShardAdapter` (§A); union/projection/federation are the orchestrator's. | Keeps the engine small; no duplication of the blueprint. |
| E-2 | **Small kernel.** The kernel is only: page store + history, the page model (reused), the extension runtime, the API. | Common case (a plain wiki) is trivial. |
| E-3 | **Everything else is a typed extension.** No feature beyond the c2-minimum is baked into the kernel. | Integrated-whole-yet-selective; testable boundary. |
| E-4 | **Per-shard activation.** A shard runs an *activation profile* (a set of extensions + config); unused features cost nothing. | "Activate only what you need." |
| E-5 | **Capability profile is derived from active extensions.** The §A profile the engine declares is computed from its activation profile, then conformance-verified. | One source of truth; honest, verified capabilities. |
| E-6 | **Headless & API-first.** The API is the only interface; no bundled UI/rendering (consumer concern, L6). | INTENT amendment; clean orchestrator/consumer split. |
| E-7 | **Agent-first ergonomics.** The API is typed, introspectable, batchable, low-round-trip. | INTENT: optimized for efficient agent/automation access. |
| E-8 | **Reuse over reinvent.** Page model, history/journal, activation, and authz are *consumed* (existing capabilities), not rebuilt. | Smallness; reuse-surface alignment. |
| E-9 | **Extensions are typed & verified.** An extension declares its types/hooks/deps; activation is rejected if types conflict or deps are unmet (impossible profiles forbidden). | Stringency; mirrors §6.5 + conformance. |
|
||||
---
|
||||
|
||||
## 3. The kernel (four concepts)
|
||||
|
||||
The kernel is deliberately four things — nothing more is mandatory.
|
||||
|
||||
1. **Page** — the backend-neutral page model (`capability.wiki.page-model`, reused as-is):
|
||||
stable identity ≠ placement, layered provenance, page shapes. The kernel does **not** redefine
|
||||
it; extensions may *register additional shapes/types* (§4).
|
||||
2. **Store + history** — a git-backed page store (the engine is the *git-IS-store* case from the
|
||||
blueprint): a write is a commit; history is native and recoverable (E-3/I-10). Coordination
|
||||
decisions reuse the event-sourced journal (`capability.wiki.coordination-journal`).
|
||||
3. **Extension runtime** — the typed-extension registry, hook dispatcher, type checker, and
|
||||
activation engine (§4). *This is the core innovation; it is the only “framework” in the kernel.*
|
||||
4. **API** — the headless, typed, agent-first surface (§7). Kernel endpoints cover the c2-minimum
|
||||
(page CRUD-as-history, links, history); extensions extend the surface through typed routes.
|
||||
|
||||
The **c2-minimum** a kernel-only shard delivers (no extensions): write a page, link pages
|
||||
(`[[wikilink]]` + red-link), never lose an edit. That is a complete, useful headless wiki.
|
||||
|
||||
---
|
||||
|
||||
## 4. The typed-extension model (the framework)
|
||||
|
||||
An **Extension** is a typed unit declaring a contract the runtime enforces:
|
||||
|
||||
```
Extension:
  id            : reverse-domain id (e.g. ext.struct.typed-records)
  provides      : capability ids it realizes (reuse-surface; e.g. capability.wiki.page-model[typed])
  types         : page shapes / field schemas / content-types it introduces (typed, validated)
  hooks         : kernel lifecycle bindings it implements (see below)
  api           : typed routes it adds to the headless surface
  depends_on    : other extensions / consumed capabilities required
  conflicts_with: extensions it cannot co-activate with
  config        : declared, schema-checked activation parameters
```
|
||||
**Hooks (the kernel lifecycle the runtime dispatches):**
|
||||
`on_resolve` (name→page), `on_read`, `on_write` (validate/transform a draft), `on_link`
|
||||
(link/transclusion resolution), `on_history`, `on_query`, `on_render_request` (produce a derived
|
||||
representation for a consumer), `on_profile` (contribute capability-spectrum positions, E-5).
|
||||
Hooks are **typed** (typed inputs/outputs) and dispatched in a **declared, deterministic order**.
|
||||
|
||||
**Typing & composition (stringency):**
|
||||
- At activation, the runtime builds the **dependency closure**, checks **type consistency** (no
|
||||
two active extensions claim incompatible types for the same page shape/field; `conflicts_with`
|
||||
honoured), and rejects an **impossible profile** — exactly the §6.5 implication-rule discipline,
|
||||
applied to extensions. A rejected profile fails fast at boot, never silently.
|
||||
- Composition is **deterministic**: hook order is declared; conflicts are resolved by explicit
|
||||
precedence or rejection, never by accident.
|
||||
- Extensions ship a **conformance check** (mirrors §6.6): an activated extension is exercised
|
||||
against its declared types/hooks before the shard serves traffic — *typed contracts verified,
|
||||
not trusted*.
|
||||
|
||||
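The activation-time checks listed above (dependency closure, `conflicts_with`, rejecting an impossible profile) can be sketched as below. This is a simplified illustration under stated assumptions: the `Extension` record keeps only three of the declared fields, type-consistency checking is omitted, extension ids such as `ext.x` are made up, and sorting by id stands in for the declared hook order. `ImpossibleProfile` and `resolve_activation` are hypothetical names.

```python
from dataclasses import dataclass


class ImpossibleProfile(Exception):
    """Raised when an activation profile cannot be satisfied."""


@dataclass(frozen=True)
class Extension:
    id: str
    depends_on: frozenset[str] = frozenset()
    conflicts_with: frozenset[str] = frozenset()


def resolve_activation(requested: set[str], registry: dict[str, Extension]) -> list[str]:
    """Return the dependency closure in a deterministic order, or raise."""
    closure: set[str] = set()
    pending = sorted(requested)
    while pending:
        ext_id = pending.pop()
        if ext_id in closure:
            continue
        if ext_id not in registry:
            raise ImpossibleProfile(f"unmet dependency: {ext_id}")
        closure.add(ext_id)
        pending.extend(sorted(registry[ext_id].depends_on))
    # Conflicts are checked on the full closure, so transitive clashes are caught.
    for ext_id in sorted(closure):
        clash = registry[ext_id].conflicts_with & closure
        if clash:
            raise ImpossibleProfile(f"{ext_id} conflicts with {sorted(clash)}")
    return sorted(closure)
```

Raising during resolution, before any hook is dispatched, is what lets a rejected profile fail at boot rather than at the first request that touches the conflicting extension.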
**Per-shard activation (reuse, not reinvent):**
|
||||
- A shard's **activation profile** = `{extension id → config}`. Activation/evaluation **reuses
|
||||
`capability.feature-control.evaluate`** (helix_forge/feature-control) — shard-wiki does not
|
||||
build a bespoke flagging system (T3 consumption).
|
||||
- **E-5 in action:** the engine's `on_profile` hooks fold the active extensions into the §A
|
||||
**capability profile** the shard advertises to the orchestrator (e.g. activate
|
||||
`ext.struct.typed-records` → the `structure` spectrum rises and `structured-payload` is
|
||||
declared). The profile is then conformance-verified (§A.2). *Configuration → capability →
|
||||
conformance is one chain.*
|
||||
|
||||
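The "E-5 in action" fold can be pictured with the sketch below. It assumes, for illustration only, that a spectrum position is an integer, that each `on_profile` hook returns a pair of (positions, declared flags), and that an extension can only raise a position. None of these shapes are fixed by this document; `derive_profile`, `Profile` and `OnProfile` are hypothetical names.

```python
from typing import Callable

# Hypothetical shapes: spectrum name -> integer position, plus declared flags.
Profile = dict[str, int]
OnProfile = Callable[[], tuple[Profile, set[str]]]


def derive_profile(hooks: dict[str, OnProfile]) -> tuple[Profile, set[str]]:
    """Fold every active extension's on_profile contribution into one profile."""
    spectrum: Profile = {}
    declared: set[str] = set()
    for ext_id in sorted(hooks):  # a fixed order keeps the fold deterministic
        positions, flags = hooks[ext_id]()
        for name, position in positions.items():
            # Assumed rule: a contribution may raise a position, never lower it.
            spectrum[name] = max(spectrum.get(name, 0), position)
        declared |= flags
    return spectrum, declared
```

For example, a hook registered for `ext.struct.typed-records` that returns `({"structure": 2}, {"structured-payload"})` would raise the `structure` position and declare `structured-payload`, matching the example in the text; the result would still need to pass conformance before being advertised.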
---
|
||||
|
||||
## 5. Featureset map: core vs extensions, and conflict mediation
|
||||
|
||||
The engine realizes the T2 "Capability structure" layer (`UseCaseCatalog.md`). Mapping (the
|
||||
*page/content-level* clusters; **X-FED and X-ATT are orchestrator concerns, not engine
|
||||
extensions** — E-1):
|
||||
|
||||
| Engine kernel (always on) | T2 | reuse-surface |
|---------------------------|----|---------------|
| Page lifecycle, identity/placement, history, links, store | EC-1…EC-5 | `capability.wiki.page-model`, `…coordination-journal`, `…adapter-contract` |
|
||||
| Built-in typed extension | T2 cluster | provides / consumes | default |
|--------------------------|-----------|---------------------|---------|
| `ext.overlay` | X-OVERLAY | `capability.wiki.overlay` | on (no-op locally) |
| `ext.authz` (L0→L4 tiers) | X-AUTHZ | consumes `capability.authorization.policy-evaluate` | L0 |
| `ext.views` (BackLinks/RecentChanges/…) | X-VIEW | `capability.wiki.derived-views` | BackLinks/RecentChanges on |
| `ext.struct` (typed/computed/graph) | X-STRUCT | `capability.wiki.page-model[typed]` | off |
| `ext.addr` (span addr / transclusion / query) | X-ADDR | `capability.wiki.page-model`+query | transclusion on |
| `ext.compute` (literate/notebook/program/live) | X-COMP | `capability.wiki.engine-typed-extensions` | off (gated, sandbox) |
| `ext.prov` (rich provenance/metadata) | X-PROV | `capability.wiki.page-model[provenance]` | base on |
| `ext.collab` (c2 social patterns) | X-COLLAB | (UI/convention; mostly consumer) | off |
|
||||
**Conflict mediation (T2 map) realized by the framework** — every tension is a *mechanism*, not a
|
||||
baked-in choice, so one featureset serves all:
|
||||
|
||||
| Tension | Realized by |
|---------|-------------|
| open vs governed | `ext.authz` tiers (additive); kernel history is the floor at L0 |
| lossless vs lossy | a `translate` hook + fidelity report (consumes the proposed `capability.content.translation-fidelity`, G2) |
| live vs snapshot | `ext.compute`/`ext.addr` mark liveness; degrade to snapshot (never imply live) |
| canonical vs chorus | detection in kernel; resolution is a policy preset (orchestrator) |
| integrated-whole vs only-what-you-need | **the activation profile** (E-4) + typed composition (§4); the headline mediation |
| minimal vs feature-rich | small kernel (§3) + extensions; nothing beyond c2 is mandatory |
|
||||
---
|
||||
|
||||
## 6. The engine as a canonical-mode shard
|
||||
|
||||
The engine exposes itself through an `EngineShardAdapter` implementing §A:
|
||||
|
||||
- **Substrate** git-IS-store; **history** git-native; **write** = commit; `current_rev` = sha
|
||||
(apply-under-drift works out of the box). It is the **most capable shard** shard-wiki can
|
||||
attach — it dogfoods the contract.
|
||||
- Its **capability profile is computed from active extensions** (E-5) and **conformance-verified**
|
||||
(§A.2) — so the orchestrator sees an honest profile, and federation ops degrade by the engine's
|
||||
*actually-activated* capabilities.
|
||||
- The orchestrator attaches it like any shard; **federation/union/projection are not in the
|
||||
engine** (E-1). A standalone deployment is "the engine as the sole canonical shard"; a
|
||||
federated deployment is "the engine as one shard among many." Same engine, no re-architecture.
|
||||
|
||||
This is the precise realization of the INTENT reconciliation: shard-wiki orchestrates; the engine
|
||||
is the first-party shard it can attach.
|
||||
|
||||
---
|
||||
|
||||
## 7. Headless API surface & agent ergonomics (E-6/E-7)
|
||||
|
||||
API-first means the typed API is the product; there is no UI. Agent-first means it is designed
|
||||
for cheap, deterministic machine consumption:
|
||||
|
||||
- **Typed resource API** over pages, links, history, spans — content-negotiated (raw Markdown,
|
||||
the structured page model, or an extension-rendered representation via `on_render_request`).
|
||||
- **Capability/extension introspection** — an endpoint returns the shard's **active extensions,
|
||||
their types, and the derived §A capability profile**, so an agent can discover *what this shard
|
||||
can do* before acting (no trial-and-error). This is the agent-facing twin of E-5.
|
||||
- **Batch & query** — multi-page reads, link-graph and RecentChanges queries (via `ext.views`),
|
||||
and `on_query` delegation — minimizing round-trips.
|
||||
- **Write via overlay** — edits go through the overlay path (FederationRequirements ADR-05), so
|
||||
agent writes are safe (draft → apply-under-drift) and attributable.
|
||||
- **Deterministic & provenance-carrying** — every response carries the provenance envelope;
|
||||
identical inputs yield identical outputs (no hidden state) — friendly to caching agents.
|
||||
|
||||
---
|
||||
|
||||
## 8. Implementation sketch (module layout)
|
||||
|
||||
The engine lives under the shard-wiki package as a backend (it sits at L0/L1 — a shard behind the
|
||||
adapter; nothing in the orchestrator depends *up* on it):
|
||||
|
||||
```
src/shard_wiki/engine/
  kernel.py        # page store + history (git-IS-store), lifecycle; reuses model/, provenance/, coordination/
  extension.py     # Extension contract, registry, typed hook dispatcher, type checker
  activation.py    # activation profile; reuses capability.feature-control.evaluate
  profile.py       # derive the §A CapabilityProfile from active extensions (E-5) + conformance
  api.py           # headless, typed, agent-first surface (+ extension introspection)
  adapter.py       # EngineShardAdapter implements adapters/ ShardAdapter (canonical-mode shard)
  extensions/      # built-ins: overlay/ authz/ views/ struct/ addr/ compute/ prov/ collab/
```
|
||||
Dependency rule: `engine/` consumes `model/`, `provenance/`, `coordination/`, `adapters/`
|
||||
(contract), `policy/`; it is consumed *only* via its `EngineShardAdapter` (the orchestrator
|
||||
attaches it as a shard). No orchestrator-tier (`union/`, `projection/`) import.
|
||||
|
||||
---
|
||||
|
||||
## 9. Reuse (what the engine consumes vs registers)
|
||||
|
||||
- **Consumes:** `capability.feature-control.evaluate` (activation),
  `capability.authorization.policy-evaluate` (`ext.authz`), the proposed
  `capability.content.translation-fidelity` (G2, lossy translation), and shard-wiki's own
  `capability.wiki.{page-model, coordination-journal, adapter-contract, overlay, derived-views}`.
- **Registers / realizes:** `capability.wiki.engine-typed-extensions` (this document is its
  Discovery evidence: D2→D3 on ratification). The cross-cutting **typed-extension framework**
  pattern is proposed back to the reuse surface as **G1**
  (`capability.platform.typed-extension-framework`); this engine is its first instance.
|
||||
---
|
||||
|
||||
## 10. Traceability
|
||||
|
||||
- **INTENT** — realizes the 2026-06-15 amendment (decision `84ffdb48`): headless, API-first,
|
||||
additive native engine = canonical-mode shard backend; honours all engine invariants and the
|
||||
orchestrator boundary (E-1).
|
||||
- **Use cases** — the kernel/extension split *is* the T2 "Capability structure" layer
|
||||
(`UseCaseCatalog.md`); every UC is either kernel (EC-1…EC-5) or a named extension; conflicts
|
||||
use the T2 mediation map (§5). The engine must ultimately cover UC-01–UC-84 (per-shard subsets).
|
||||
- **Architecture** — consistent with `CoreArchitectureBlueprint` (engine = canonical-mode shard,
|
||||
§6 contract, §7 page model, §8.1 journal) and `TechnicalSpecificationDocument §A` (the contract
|
||||
it implements). `FederationRequirements` ADR-05/06 supply overlay + link semantics.
|
||||
- **Reuse surface** — §9; G1/G2 proposals from SHARD-WP-0013 T3.
|
||||
|
||||
## 11. Decisions / deferred / open
|
||||
|
||||
**Decided:** small page-store kernel + typed-extension runtime (E-2/E-3); engine is one shard,
|
||||
not a federation layer (E-1); capability profile derived from active extensions (E-5); headless,
|
||||
API-first, agent-first (E-6/E-7); activation reuses `feature-control` (E-8); extensions are
|
||||
typed + conformance-verified (E-9).
|
||||
|
||||
**Deferred:** the concrete extension SDK/ABI and hook signatures; the API protocol
(REST/GraphQL/MCP), where agent-first introspection is required but the wire format is an
implementation spike; the built-in extensions' internal designs (each is a later workplan).
|
||||
**Open (tracked):** does `ext.compute` ever execute in-process or strictly delegate/snapshot
|
||||
(ties blueprint §8.5 + trust/sandbox); is the typed-extension framework promoted to the
|
||||
reuse-surface platform capability (G1) and then *consumed* here rather than engine-owned;
|
||||
introspection granularity vs. leaking internal structure to agents.
|
||||
|
||||
## 12. Stability note
|
||||
|
||||
The **thesis (§1)** and **invariants (§2)** are load-bearing, especially *engine-is-one-shard*
(E-1), *small-kernel/everything-else-typed-extension* (E-2/E-3), and
*capability-profile-derived-from-extensions* (E-5). Changing them (e.g. moving federation into
the engine, or baking a feature into the kernel) is an architectural change in the sense of
INTENT's Stability Note and should be rare and deliberate. The headless/API-first posture is
fixed by the ratified INTENT amendment.
79
spec/adr/ADR-0001-engine-activation-via-feature-control.md
Normal file
@@ -0,0 +1,79 @@
|
||||
# ADR-0001 — Engine extension activation via feature-control (OpenFeature)
|
||||
|
||||
Status: **Accepted** · Date: 2026-06-15 · Deciders: tegwick · Source: SHARD-WP-0013 follow-up
|
||||
(feature-control assessment)
|
||||
|
||||
> First repo-level ADR. (Note: `FederationRequirements.md` contains document-internal
|
||||
> "ADR-01…06" design notes — those are scoped to that spec; this `spec/adr/` series is the
|
||||
> repository's standalone architecture decision log, starting here.)
|
||||
|
||||
## Context
|
||||
|
||||
`WikiEngineCoreArchitecture.md` (SHARD-WP-0013 T5) defines the engine as a small kernel plus a
|
||||
**typed-extension framework** where each shard **activates only the extensions it needs**
|
||||
(invariants E-4 activation, E-8 reuse-not-reinvent). It needs a mechanism to decide, per shard
|
||||
(and per tenant/context), which extensions/features are active — without baking a bespoke flag
|
||||
system into the engine, and without breaking the **standalone, zero-external-dependency** L0
|
||||
posture shard-wiki guarantees.
|
||||
|
||||
The helix_forge sibling **`feature-control`** (`capability.feature-control.evaluate`, registered
|
||||
at **D5 / A4 / C3 / R3**) provides exactly this: an **OpenFeature**-based feature-availability
|
||||
control plane with a working SDK (`feature_control_sdk`: `FeatureControlClient`, `Resolver`, a
|
||||
static `LocalProvider`), context-scoped evaluation (`tenant_id`/scope), explainable decisions,
|
||||
and graceful degradation when OpenFeature is absent. shard-wiki already proposed this as a T3
|
||||
*consumption* (reuse, don't rebuild).
|
||||
|
||||
## Decision
|
||||
|
||||
**Adopt `feature-control` (via the OpenFeature standard) as the engine's per-shard extension/
|
||||
feature activation mechanism** — *availability only* — with these constraints:
|
||||
|
||||
1. **OpenFeature-shaped, provider-pluggable.** The engine evaluates activation through an
   OpenFeature-style client. A static **`LocalProvider`** is the **standalone/L0 default**
   (zero external dependency); a `feature-control`/remote provider is plugged in for governed
   deployments. This mirrors shard-wiki's existing **identity-provider ladder** (null/local
   default → external when present).
2. **Availability ≠ authorization.** feature-control decides *which extensions are active*,
   never *who may read/write*. Authorization stays in core (X-AUTHZ /
   `authorization.policy-evaluate`). The two are composed but never conflated. (feature-control's
   own INTENT requires this.)
3. **Engine layer, not the orchestrator foundation.** Integration lives in
   `engine/activation.py`; the current `src/shard_wiki/` core stays dependency-free.
   OpenFeature/feature-control is an optional extra, kept out of the standalone path by the
   `LocalProvider`.
4. **Thin slice only.** Consume `feature-control.evaluate` (mature, A4). Do **not** take a
   dependency on the heavier control-plane governance / `rollout` / `visibility` (A2) until a
   concrete need appears.
|
||||
Activation keys = extension ids; evaluation context = `{tenant_id: root-entity, shard_id, …}`;
|
||||
the resulting active-extension set then **derives** the shard's §A capability profile (E-5).
|
||||
|
||||
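The provider-pluggable shape of this decision can be sketched as follows. This is not the `feature_control_sdk` or OpenFeature API: `ActivationProvider`, `StaticLocalProvider` (a stand-in for the SDK's `LocalProvider`) and `active_extensions` are hypothetical names, and falling back to the local provider when the remote one raises is one assumed reading of "degrade to LocalProvider".

```python
from typing import Mapping, Optional, Protocol


class ActivationProvider(Protocol):
    """Hypothetical provider interface (availability only)."""

    def is_active(self, ext_id: str, context: Mapping[str, str]) -> bool: ...


class StaticLocalProvider:
    """Static, in-process provider: the zero-dependency standalone default."""

    def __init__(self, active: Mapping[str, bool]) -> None:
        self._active = dict(active)

    def is_active(self, ext_id: str, context: Mapping[str, str]) -> bool:
        return self._active.get(ext_id, False)  # unknown extensions stay off


def active_extensions(known: list[str], context: Mapping[str, str],
                      remote: Optional[ActivationProvider] = None,
                      local: Optional[ActivationProvider] = None) -> list[str]:
    """Evaluate which extensions are active; authorization is decided elsewhere."""
    fallback = local if local is not None else StaticLocalProvider({})
    primary = remote if remote is not None else fallback
    result = []
    for ext_id in known:
        try:
            on = primary.is_active(ext_id, context)
        except Exception:
            # Governed provider unavailable: degrade to the local default.
            on = fallback.is_active(ext_id, context)
        if on:
            result.append(ext_id)
    return result
```

The function returns only a set of extension ids; nothing in it answers "who may read or write", which keeps constraint 2 visible in the code shape itself.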
## Consequences
|
||||
|
||||
**Positive**
|
||||
- No bespoke flag system; reuses a mature (D5/A4) capability — reuse-surface aligned.
|
||||
- Standalone stays zero-dep (LocalProvider); governed deployments get real runtime control,
|
||||
multi-tenant scoping, and **explainable** decisions that feed the engine's agent-introspection
|
||||
API (E-7: "why is extension X off for this shard?").
|
||||
- "Activate only what you need" + compute control become first-class and reversible at runtime.
|
||||
- Clean layering: availability (feature-control) vs authorization (core) vs identity (provider).
|
||||
|
||||
**Negative / risks (mitigated by the constraints)**
|
||||
- An optional OpenFeature dependency at the engine layer (mitigated: out of the standalone path).
|
||||
- Coupling to an external control plane in governed mode (mitigated: provider-pluggable, degrade
|
||||
to LocalProvider).
|
||||
- Temptation to route authz through it (mitigated: constraint 2, hard boundary).
|
||||
|
||||
## Alternatives considered
|
||||
|
||||
- **Bespoke per-shard flag/config in the engine** — rejected: reinvents feature-control, no
|
||||
standard, no multi-tenant/explainability, violates reuse-not-reinvent (E-8).
|
||||
- **No activation (all extensions always on)** — rejected: defeats "small core + activate only
|
||||
what you need" (E-2/E-4) and the compute-control goal.
|
||||
- **Build on the heavier feature-control control-plane now** — deferred: over-scoping a single
|
||||
engine's activation; revisit if rollout/governance needs emerge.
|
||||
|
||||
## Related
|
||||
|
||||
`WikiEngineCoreArchitecture.md` (E-4/E-8, §4 activation), `UseCaseCatalog.md` capability-structure
|
||||
layer (X-AUTHZ vs activation), `history/260615-reuse-surface-contributions.md` (T3 consumption),
|
||||
reuse-surface `capability.feature-control.evaluate`, INTENT amendment decision `84ffdb48`.
|
||||
11
spec/adr/README.md
Normal file
@@ -0,0 +1,11 @@
|
||||
# spec/adr/ — Architecture Decision Records
|
||||
|
||||
Repository-level ADRs: one decision per file, `ADR-NNNN-<slug>.md`, status
|
||||
**Proposed / Accepted / Superseded**. Each records Context · Decision · Consequences ·
|
||||
Alternatives. These are the standalone, numbered decision log; design-note "ADRs" embedded
|
||||
inside a spec (e.g. `FederationRequirements.md` ADR-01…06) are scoped to that document and are
|
||||
not part of this series.
|
||||
|
||||
| ADR | Status | Subject |
|-----|--------|---------|
| [ADR-0001](ADR-0001-engine-activation-via-feature-control.md) | Accepted | Engine extension activation via feature-control (OpenFeature), availability-only, LocalProvider standalone default |
@@ -1,10 +1,16 @@
|
||||
"""shard-wiki — Git-based Markdown wiki orchestrator and federation layer.
|
||||
|
||||
See INTENT.md for the authoritative specification of scope and boundaries.
|
||||
This package orchestrates wiki-shaped content across heterogeneous *shards*;
|
||||
it is not itself a wiki engine.
|
||||
See INTENT.md for the authoritative specification of scope and boundaries, and
|
||||
spec/CoreArchitectureBlueprint.md for the architecture. This package orchestrates
|
||||
wiki-shaped content across heterogeneous *shards*; it is not itself a wiki engine.
|
||||
|
||||
Foundation slice (SHARD-WP-0007): attach folder shard(s) to an
|
||||
:class:`~shard_wiki.space.InformationSpace`, resolve a name through the union, and
|
||||
read a page with layered provenance (chorus on ambiguity).
|
||||
"""
|
||||
|
||||
from shard_wiki.space import InformationSpace
|
||||
|
||||
__version__ = "0.0.0"
|
||||
|
||||
__all__ = ["__version__"]
|
||||
__all__ = ["__version__", "InformationSpace"]
|
||||
|
||||
25
src/shard_wiki/adapters/__init__.py
Normal file
@@ -0,0 +1,25 @@
|
||||
"""adapters/ — the shard adapter contract (bottom waist) and concrete adapters."""
|
||||
|
||||
from shard_wiki.adapters.conformance import (
    Check,
    ConformanceError,
    ConformanceReport,
    assert_conformant,
    run_conformance,
)
from shard_wiki.adapters.contract import CONTRACT_VERSION, ShardAdapter
from shard_wiki.adapters.folder import FolderAdapter
from shard_wiki.adapters.git import GitShardAdapter, PageRevision

__all__ = [
    "ShardAdapter",
    "FolderAdapter",
    "GitShardAdapter",
    "PageRevision",
    "CONTRACT_VERSION",
    "Check",
    "ConformanceReport",
    "ConformanceError",
    "run_conformance",
    "assert_conformant",
]
142
src/shard_wiki/adapters/conformance.py
Normal file
@@ -0,0 +1,142 @@
"""Adapter conformance — profiles are verified, not self-asserted (TSD §A.2, blueprint §6.6).

Capability-as-data (I-3) is only sound if a binding's *declared* profile matches its *observed*
behaviour. This battery exercises a binding and reports, check by check, whether claim == reality;
``assert_conformant`` gates registration. A lying/buggy profile fails here instead of silently
poisoning degradation decisions downstream.

This slice verifies the read path, a non-destructive WRITE round-trip, and honest absence of
unclaimed verbs. Positive probes for claimed diff/merge are deferred to a later workplan.
"""

from __future__ import annotations

from dataclasses import dataclass

from shard_wiki.adapters.contract import ShardAdapter
from shard_wiki.model import NotSupported, Verb

__all__ = ["Check", "ConformanceReport", "ConformanceError", "run_conformance", "assert_conformant"]

# Optional verbs whose *absence* must be honest (calling an unclaimed one raises NotSupported).
_HONEST_ABSENCE_VERBS = (Verb.WRITE, Verb.DIFF, Verb.NOTIFY)
_PROBE = {
    Verb.WRITE: lambda a: a.write("__probe__", ""),
    Verb.DIFF: lambda a: a.diff("__probe__", "__probe2__"),
    Verb.NOTIFY: lambda a: a.notify(),
}


@dataclass(frozen=True, slots=True)
class Check:
    name: str
    ok: bool
    detail: str = ""


@dataclass(frozen=True, slots=True)
class ConformanceReport:
    adapter: str
    checks: tuple[Check, ...]

    @property
    def ok(self) -> bool:
        return all(c.ok for c in self.checks)

    @property
    def failures(self) -> tuple[Check, ...]:
        return tuple(c for c in self.checks if not c.ok)

    def diff(self) -> str:
        return "; ".join(f"{c.name}: {c.detail}" for c in self.failures) or "conformant"


class ConformanceError(Exception):
    def __init__(self, report: ConformanceReport) -> None:
        super().__init__(f"{report.adapter} not conformant — {report.diff()}")
        self.report = report


def _safe(fn, name: str, ok_detail: str = "") -> Check:
    try:
        return fn()
    except Exception as exc:  # noqa: BLE001 — a check must never crash the battery
        return Check(name, False, f"unexpected error: {exc!r}")


def run_conformance(adapter: ShardAdapter) -> ConformanceReport:
    checks: list[Check] = []
    profile = None

    def _profile_validates() -> Check:
        nonlocal profile
        # Bind only after validation succeeds, so a failed validate() leaves profile as None.
        profile = adapter.profile().validate()
        return Check("profile-validates", True)

    checks.append(_safe(_profile_validates, "profile-validates"))
    if profile is None:  # profile() or validate() failed; can't probe further meaningfully
        return ConformanceReport(type(adapter).__name__, tuple(checks))

    # READ is the capability floor.
    checks.append(Check("supports-read", profile.supports(Verb.READ),
                        "" if profile.supports(Verb.READ) else "READ not declared"))

    # READ round-trips: a declared-readable shard must actually read its own keys.
    def _read_round_trips() -> Check:
        keys = list(adapter.keys())
        if not keys:
            return Check("read-round-trips", True, "empty shard")
        page = adapter.read(keys[0])
        if page.identity.shard != adapter.shard_id:
            return Check("read-round-trips", False,
                         f"page shard {page.identity.shard!r} != {adapter.shard_id!r}")
        if not isinstance(page.body, str):
            return Check("read-round-trips", False, "body is not text")
        return Check("read-round-trips", True)

    if profile.supports(Verb.READ):
        checks.append(_safe(_read_round_trips, "read-round-trips"))

    # WRITE positive probe: a claimed-writable shard must actually round-trip a write. The probe
    # is content-preserving (rewrite an existing page with its own body) so it is non-destructive.
    def _write_round_trips() -> Check:
        keys = list(adapter.keys())
        if not keys:
            return Check("write-round-trips", True, "empty shard")
        k = keys[0]
        original = adapter.read(k).body
        adapter.write(k, original)
        if adapter.read(k).body != original:
            return Check("write-round-trips", False, "rewrite did not preserve body")
        return Check("write-round-trips", True)

    if profile.supports(Verb.WRITE):
        checks.append(_safe(_write_round_trips, "write-round-trips"))

    # Honest absence: an *unclaimed* optional verb must raise NotSupported when invoked.
    for verb in _HONEST_ABSENCE_VERBS:
        if profile.supports(verb):
            continue  # claimed → not an absence case (WRITE is probed above; others deferred)
        name = f"honest-absence:{verb.value}"

        def _probe(v=verb, n=name) -> Check:
            try:
                _PROBE[v](adapter)
            except NotSupported:
                return Check(n, True)
            except Exception as exc:  # noqa: BLE001
                return Check(n, False, f"raised {type(exc).__name__}, expected NotSupported")
            return Check(n, False, "did not raise NotSupported though verb is unclaimed")

        checks.append(_probe())

    return ConformanceReport(type(adapter).__name__, tuple(checks))


def assert_conformant(adapter: ShardAdapter) -> ConformanceReport:
    """Run the battery; raise :class:`ConformanceError` if any check fails. Returns the report."""
    report = run_conformance(adapter)
    if not report.ok:
        raise ConformanceError(report)
    return report
||||
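The honest-absence rule in `conformance.py` can be tried in isolation. The following is a minimal sketch, not the project's code: `NotSupported` and `Check` are local stand-ins for the real types, and `probe_absence`, `HonestReadOnly`, `SilentlyWritable`, and `Crashing` are hypothetical names introduced only for illustration. It shows the three outcomes the battery distinguishes when a verb is not declared.

```python
from dataclasses import dataclass
from typing import Callable


class NotSupported(Exception):
    """Stand-in for shard_wiki.model.NotSupported (assumed to be an Exception subclass)."""


@dataclass(frozen=True)
class Check:
    """Stand-in for the Check record: a name, a verdict, and an optional detail."""
    name: str
    ok: bool
    detail: str = ""


def probe_absence(name: str, call: Callable[[], object]) -> Check:
    """An unclaimed verb is honest only if invoking it raises NotSupported."""
    try:
        call()
    except NotSupported:
        return Check(name, True)
    except Exception as exc:  # any other error is not an honest refusal
        return Check(name, False, f"raised {type(exc).__name__}, expected NotSupported")
    return Check(name, False, "did not raise NotSupported though verb is unclaimed")


class HonestReadOnly:
    def write(self, key: str, body: str) -> None:
        raise NotSupported("read-only")


class SilentlyWritable:
    def write(self, key: str, body: str) -> None:
        return None  # accepts the call although WRITE was never declared


class Crashing:
    def write(self, key: str, body: str) -> None:
        raise RuntimeError("boom")


honest = probe_absence("honest-absence:write", lambda: HonestReadOnly().write("__probe__", ""))
lying = probe_absence("honest-absence:write", lambda: SilentlyWritable().write("__probe__", ""))
crashing = probe_absence("honest-absence:write", lambda: Crashing().write("__probe__", ""))

for check in (honest, lying, crashing):
    print(check.ok, check.detail)
```

Only the first adapter passes. The other two fail for different reasons, which is why the detail string is kept alongside the boolean verdict.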
52
src/shard_wiki/adapters/contract.py
Normal file
@@ -0,0 +1,52 @@
"""The shard adapter contract — the bottom narrow waist (CoreArchitectureBlueprint §6, TSD §A).

A backend participates by implementing :class:`ShardAdapter`. ``shard_id``, ``profile``, ``keys``
and ``read`` are mandatory; everything else is an optional capability that defaults to raising
:class:`~shard_wiki.model.NotSupported` — so a limited backend is honest about what it can't do
(graceful degradation, I-8) and core never assumes a capability it wasn't given (capability-as-
data, I-3). Declared profiles are verified by the conformance suite (T4), never taken on trust.
"""

from __future__ import annotations

import abc
from collections.abc import Iterable

from shard_wiki.model import CapabilityProfile, NotSupported, Page

__all__ = ["ShardAdapter", "CONTRACT_VERSION"]

CONTRACT_VERSION = "0.1"


class ShardAdapter(abc.ABC):
    """Versioned interface a backend implements to attach as a shard."""

    contract_version: str = CONTRACT_VERSION

    @property
    @abc.abstractmethod
    def shard_id(self) -> str:
        """Stable id scoping every Identity this shard mints."""

    @abc.abstractmethod
    def profile(self) -> CapabilityProfile:
        """The (to-be-verified) capability profile of this binding."""

    @abc.abstractmethod
    def keys(self) -> Iterable[str]:
        """The stable page keys this shard offers (the handle half of Identity)."""

    @abc.abstractmethod
    def read(self, key: str) -> Page:
        """Read one page by its stable key. Raises ``KeyError`` if absent."""

    # --- optional capability verbs: honest NotSupported by default ---
    def write(self, key: str, body: str) -> Page:  # noqa: ARG002
        raise NotSupported(f"{type(self).__name__} does not support write")

    def diff(self, key: str, other: str) -> str:  # noqa: ARG002
        raise NotSupported(f"{type(self).__name__} does not support diff")

    def notify(self):
        raise NotSupported(f"{type(self).__name__} does not support notify")
||||
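The contract's split between mandatory members and optional verbs can be seen with a trimmed stand-in. This sketch assumes nothing from `shard_wiki`: `MiniAdapter`, `DictAdapter`, and `try_write` are hypothetical names, pages are plain strings rather than `Page` objects, and the capability profile is left out. It only illustrates how an abstract base forces the mandatory members while an inherited default keeps an unsupported verb honest.

```python
import abc
from collections.abc import Iterable


class NotSupported(Exception):
    """Stand-in for shard_wiki.model.NotSupported."""


class MiniAdapter(abc.ABC):
    """Trimmed stand-in for ShardAdapter: no profile, pages are plain strings."""

    @property
    @abc.abstractmethod
    def shard_id(self) -> str:
        """Stable id of the shard."""

    @abc.abstractmethod
    def keys(self) -> Iterable[str]:
        """The page keys this shard offers."""

    @abc.abstractmethod
    def read(self, key: str) -> str:
        """Read one page body; raises KeyError if absent."""

    def write(self, key: str, body: str) -> str:
        # Optional verb: refuse by default instead of pretending to succeed.
        raise NotSupported(f"{type(self).__name__} does not support write")


class DictAdapter(MiniAdapter):
    """A read-only shard backed by an in-memory dict (illustration only)."""

    def __init__(self, shard_id: str, pages: dict[str, str]) -> None:
        self._shard_id = shard_id
        self._pages = dict(pages)

    @property
    def shard_id(self) -> str:
        return self._shard_id

    def keys(self) -> Iterable[str]:
        return sorted(self._pages)

    def read(self, key: str) -> str:
        return self._pages[key]


def try_write(adapter: MiniAdapter, key: str, body: str) -> str:
    """Caller-side degradation: fall back when the verb is not supported."""
    try:
        adapter.write(key, body)
    except NotSupported:
        return "degraded: read-only"
    return "written"


shard = DictAdapter("notes", {"Home": "# Home"})
print(shard.read("Home"), "|", try_write(shard, "Home", "# Edited"))
```

Because `DictAdapter` never overrides `write`, the caller gets a typed refusal it can branch on, and the stored page is left untouched.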
111
src/shard_wiki/adapters/folder.py
Normal file
@@ -0,0 +1,111 @@
"""FolderAdapter — a file-store shard over a directory of Markdown (read-only by default).

The home-case substrate: a plain folder of ``.md`` files. The relative path (sans extension,
``/``-separated) is the stable page **key**; the file is the page **body**; mtime gives a
freshness stamp. Read-only by default; ``writable=True`` adds per-page writes. The declared profile
reflects the chosen mode (file-store, path addressing, no native history/query).
"""

from __future__ import annotations

from collections.abc import Iterable
from datetime import datetime, timezone
from pathlib import Path

from shard_wiki.adapters.contract import ShardAdapter
from shard_wiki.model import (
    AccessGrant,
    Addressing,
    AttachmentMode,
    CapabilityProfile,
    ContentOpacity,
    History,
    Identity,
    MergeModel,
    NativeQuery,
    NotSupported,
    OperationalEnvelope,
    Page,
    Placement,
    Substrate,
    Translation,
    Verb,
    WriteGranularity,
)
from shard_wiki.provenance import Liveness, ProvenanceEnvelope, Staleness

__all__ = ["FolderAdapter"]


class FolderAdapter(ShardAdapter):
    def __init__(self, shard_id: str, root: str | Path, writable: bool = False) -> None:
        self._shard_id = shard_id
        self._root = Path(root)
        self._writable = writable

    @property
    def shard_id(self) -> str:
        return self._shard_id

    def profile(self) -> CapabilityProfile:
        verbs = {Verb.READ, Verb.WRITE} if self._writable else {Verb.READ}
        granularity = WriteGranularity.PER_PAGE if self._writable else WriteGranularity.NONE
        return CapabilityProfile(
            substrate=Substrate.FILES,
            attachment_mode=AttachmentMode.FILE_STORE,
            write_granularity=granularity,
            content_opacity=ContentOpacity.TRANSPARENT,
            operational_envelope=OperationalEnvelope.LOCAL_UNBOUNDED,
            access_grant=AccessGrant.OPEN,
            liveness=Liveness.STATIC,
            history=History.NONE,
            merge_model=MergeModel.NONE,
            addressing=Addressing.PATH,
            native_query=NativeQuery.NONE,
            translation=Translation.NATIVE,
            supported_verbs=frozenset(verbs),
        ).validate()

    def _path_for(self, key: str) -> Path:
        return self._root / f"{key}.md"

    def current_rev(self, key: str) -> str | None:
        """The shard's current revision token for ``key`` (mtime iso), or ``None`` if absent.
        Used for apply-under-drift comparison (blueprint §8.6)."""
        path = self._path_for(key)
        if not path.is_file():
            return None
        return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc).isoformat()

    def write(self, key: str, body: str) -> Page:
        if not self._writable:
            raise NotSupported(f"{type(self).__name__} is read-only")
        path = self._path_for(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(body, encoding="utf-8")
        return self.read(key)

    def keys(self) -> Iterable[str]:
        for p in sorted(self._root.rglob("*.md")):
            yield p.relative_to(self._root).with_suffix("").as_posix()

    def read(self, key: str) -> Page:
        path = self._path_for(key)
        if not path.is_file():
            raise KeyError(key)
        body = path.read_text(encoding="utf-8")
        mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        envelope = ProvenanceEnvelope(
            source_shard=self._shard_id,
            liveness=Liveness.STATIC,
            staleness=Staleness.FRESH,
            source_rev=mtime.isoformat(),
            observed_at=datetime.now(tz=timezone.utc),
        )
        rel = path.relative_to(self._root).as_posix()
        return Page(
            identity=Identity(self._shard_id, key),
            body=body,
            envelope=envelope,
            placements=(Placement(self._shard_id, rel),),
        )
||||
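The key and revision-token rules used by the folder adapter rely only on `pathlib` and `datetime`, so they can be checked without the package. This is a sketch under that assumption: `folder_keys` and `rev_token` are hypothetical free functions that mirror the `keys` and `current_rev` methods, and the directory layout is made up for the example.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def folder_keys(root: Path) -> list[str]:
    """Relative path, minus the .md suffix, with '/' separators on every platform."""
    return [p.relative_to(root).with_suffix("").as_posix() for p in sorted(root.rglob("*.md"))]


def rev_token(root: Path, key: str) -> Optional[str]:
    """The file's mtime as an ISO-8601 UTC string, or None if the page is absent."""
    path = root / f"{key}.md"
    if not path.is_file():
        return None
    return datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc).isoformat()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "guides").mkdir()
    (root / "Home.md").write_text("# Home\n", encoding="utf-8")
    (root / "guides" / "Setup.md").write_text("# Setup\n", encoding="utf-8")
    (root / "notes.txt").write_text("not a page\n", encoding="utf-8")  # ignored: not *.md

    keys = folder_keys(root)
    home_rev = rev_token(root, "Home")
    missing_rev = rev_token(root, "Nope")

print(keys, home_rev, missing_rev)
```

An mtime token is cheap but coarse: two writes within the filesystem's timestamp resolution can share a token, which is one reason the git-backed adapter uses commit shas instead.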
180
src/shard_wiki/adapters/git.py
Normal file
@@ -0,0 +1,180 @@
"""GitShardAdapter — a second substrate: git-as-store (SHARD-WP-0012; TSD §A.3 git-IS-store).

The home case where **git is the store *and* the journal**. Tracked ``*.md`` paths are the page
keys; the working-tree file is the body; a page's ``source_rev`` is the **commit sha of the last
commit touching its path** (per-path, so an edit to one page never drifts another). The declared
profile is *git-IS-store ⟹ substrate=git ∧ history=git-native* — the implication rule the
capability model enforces (§6.5), validated at registration like any other binding.

This adapter adds **no core changes**: it implements the same :class:`ShardAdapter` contract the
folder adapter does, proving "write an adapter + declare a verified profile" is the whole cost of a
new substrate (capability-as-data, I-3). Built on the ``git`` CLI via subprocess — zero new deps.
"""

from __future__ import annotations

import os
import subprocess
from collections.abc import Iterable
from dataclasses import dataclass
from pathlib import Path

from shard_wiki.adapters.contract import ShardAdapter
from shard_wiki.model import (
    AccessGrant,
    Addressing,
    AttachmentMode,
    CapabilityProfile,
    ContentOpacity,
    History,
    Identity,
    MergeModel,
    NativeQuery,
    NotSupported,
    OperationalEnvelope,
    Page,
    Placement,
    Substrate,
    Translation,
    Verb,
    WriteGranularity,
)
from shard_wiki.provenance import Liveness, ProvenanceEnvelope, Staleness

__all__ = ["GitShardAdapter", "PageRevision"]


@dataclass(frozen=True, slots=True)
class PageRevision:
    """One adopted git-native revision of a page: the commit sha and its subject line."""

    sha: str
    message: str

_GIT_IDENTITY = {
    "GIT_AUTHOR_NAME": "shard-wiki",
    "GIT_AUTHOR_EMAIL": "shard@shard-wiki",
    "GIT_COMMITTER_NAME": "shard-wiki",
    "GIT_COMMITTER_EMAIL": "shard@shard-wiki",
}


class GitShardAdapter(ShardAdapter):
    """A shard whose store is a git repo: keys are tracked ``*.md`` paths, revs are commit shas."""

    def __init__(self, shard_id: str, repo_path: str | Path, writable: bool = False) -> None:
        self._shard_id = shard_id
        self._repo = Path(repo_path)
        self._writable = writable
        self._repo.mkdir(parents=True, exist_ok=True)
        if not (self._repo / ".git").exists():
            self._git("init", "--quiet")

    @property
    def shard_id(self) -> str:
        return self._shard_id

    def profile(self) -> CapabilityProfile:
        # VERSION is always available — a git-IS-store has git-native history to adopt (§A.5),
        # read-only or not. WRITE (= commit, PER_PAGE) is added only in writable mode.
        verbs = {Verb.READ, Verb.VERSION}
        granularity = WriteGranularity.NONE
        if self._writable:
            verbs |= {Verb.WRITE}
            granularity = WriteGranularity.PER_PAGE
        return CapabilityProfile(
            substrate=Substrate.GIT,
            attachment_mode=AttachmentMode.GIT_IS_STORE,
            write_granularity=granularity,
            content_opacity=ContentOpacity.TRANSPARENT,
            operational_envelope=OperationalEnvelope.LOCAL_UNBOUNDED,
            access_grant=AccessGrant.OPEN,
            liveness=Liveness.STATIC,
            history=History.GIT_NATIVE,  # git-is-store ⟹ git-native (§6.5)
            merge_model=MergeModel.GIT_TEXT,
            addressing=Addressing.PATH,
            native_query=NativeQuery.NONE,
            translation=Translation.NATIVE,
            supported_verbs=frozenset(verbs),
        ).validate()

    def write(self, key: str, body: str) -> Page:
        """Write = **commit**: stage the file and commit it (skip a no-op so no empty commit),
        returning the page at the new sha. Drift detection rides on ``current_rev`` = that sha."""
        if not self._writable:
            raise NotSupported(f"{type(self).__name__} is read-only")
        rel = f"{key}.md"
        path = self._path_for(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(body, encoding="utf-8")
        self._git("add", "--", rel)
        if self._run("diff", "--cached", "--quiet").returncode != 0:  # staged changes present
            self._git("commit", "-m", f"write {rel}", env=_GIT_IDENTITY)
        return self.read(key)

    def keys(self) -> Iterable[str]:
        out = self._git("ls-files", "*.md").decode()
        for line in out.splitlines():
            yield line[: -len(".md")] if line.endswith(".md") else line

    def read(self, key: str) -> Page:
        path = self._path_for(key)
        if not path.is_file():
            raise KeyError(key)
        rev = self.current_rev(key)
        return Page(
            identity=Identity(self._shard_id, key),
            body=path.read_text(encoding="utf-8"),
            envelope=ProvenanceEnvelope(
                source_shard=self._shard_id,
                liveness=Liveness.STATIC,
                staleness=Staleness.FRESH,
                source_rev=rev,
                lineage="git-native",
            ),
            placements=(Placement(self._shard_id, f"{key}.md"),),
        )

    def current_rev(self, key: str) -> str | None:
        """The sha of the last commit touching ``key``'s path (per-path drift token), or None."""
        rel = f"{key}.md"
        if not self._path_for(key).is_file():
            return None
        sha = self._git("log", "-1", "--format=%H", "--", rel).decode().strip()
        return sha or None

    def history(self, key: str) -> tuple[PageRevision, ...]:
        """Adopt git-native history (§A.5): the commit list for ``key``'s path, newest-first.

        VERSION-gated; raises ``KeyError`` for an unknown page. Each revision is a commit sha +
        subject — the native log surfaced through the contract, not re-implemented.
        """
        if not self.profile().supports(Verb.VERSION):
            raise NotSupported(f"{type(self).__name__} does not support version")
        if not self._path_for(key).is_file():
            raise KeyError(key)
        out = self._git("log", "--format=%H%x00%s", "--", f"{key}.md").decode()
        revisions = []
        for line in out.splitlines():
            sha, _, message = line.partition("\x00")
            revisions.append(PageRevision(sha=sha, message=message))
        return tuple(revisions)

    # -- git plumbing --------------------------------------------------------

    def _path_for(self, key: str) -> Path:
        return self._repo / f"{key}.md"

    def _git(self, *args: str, stdin: bytes | None = None, env: dict | None = None) -> bytes:
        return self._run(*args, stdin=stdin, env=env, check=True).stdout

    def _run(
        self, *args: str, stdin: bytes | None = None, env: dict | None = None, check: bool = False
    ) -> subprocess.CompletedProcess:
        return subprocess.run(
            ["git", "-C", str(self._repo), *args],
            input=stdin,
            capture_output=True,
            env={**os.environ, **(env or {})},
            check=check,
        )
||||
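The `history` method parses `git log --format=%H%x00%s`, where each output line is a sha, a NUL byte, and the commit subject. The parsing step can be exercised without a repository. In this sketch `parse_log` is a hypothetical helper, and `SAMPLE` is a hand-built byte string with placeholder shas (forty repeated characters), not output captured from a real repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageRevision:
    """Stand-in for the adapter's PageRevision: commit sha plus subject line."""
    sha: str
    message: str


def parse_log(raw: bytes) -> tuple[PageRevision, ...]:
    """Split NUL-delimited `sha<NUL>subject` lines, keeping git's newest-first order."""
    revisions = []
    for line in raw.decode().splitlines():
        # partition() splits on the first NUL only; a subject line cannot contain one.
        sha, _, message = line.partition("\x00")
        revisions.append(PageRevision(sha=sha, message=message))
    return tuple(revisions)


# Illustrative input only: placeholder shas, two commits touching one page.
SAMPLE = (
    b"b" * 40 + b"\x00write Home.md\n"
    + b"a" * 40 + b"\x00initial import: Home\n"
)

history = parse_log(SAMPLE)
current_rev = history[0].sha if history else None  # newest commit is the drift token

print(len(history), history[0].message, current_rev == "b" * 40)
```

A NUL separator is a safe choice here because a commit subject may contain spaces, colons, or tabs, so any printable delimiter could collide with the message text.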
58
src/shard_wiki/coordination/__init__.py
Normal file
@@ -0,0 +1,58 @@
"""coordination/ — the event-sourced decision log (L3, coordination-canonical state)."""

from shard_wiki.coordination.decision_log import (
    CoordinationState,
    DecisionEvent,
    DecisionLog,
    EventStore,
    EventType,
    InMemoryEventStore,
    deserialize_event,
    serialize_event,
)
from shard_wiki.coordination.append_authority import (
    AppendAuthority,
    Lease,
    LeaseHeld,
    LeaseRegistry,
)
from shard_wiki.coordination.git_event_store import GitEventStore
from shard_wiki.coordination.migration import (
    export_jsonl,
    import_jsonl,
    import_log,
    migrate_space,
)
from shard_wiki.coordination.overlay import (
    ApplyResult,
    ApplyStatus,
    Overlay,
    OverlayEngine,
)
from shard_wiki.coordination.patch import Patch, render_patch

__all__ = [
    "DecisionLog",
    "DecisionEvent",
    "EventType",
    "CoordinationState",
    "EventStore",
    "InMemoryEventStore",
    "GitEventStore",
    "Lease",
    "LeaseHeld",
    "LeaseRegistry",
    "AppendAuthority",
    "import_log",
    "migrate_space",
    "export_jsonl",
    "import_jsonl",
    "serialize_event",
    "deserialize_event",
    "Overlay",
    "OverlayEngine",
    "ApplyStatus",
    "ApplyResult",
    "Patch",
    "render_patch",
]
158
src/shard_wiki/coordination/append_authority.py
Normal file
@@ -0,0 +1,158 @@
"""Per-space append authority — the single-writer lease over the log (SHARD-WP-0009 T2).

The log is a *total order per space* (§8.6). :mod:`~shard_wiki.coordination.git_event_store`
makes a fork physically impossible via compare-and-swap; this layer adds the **policy** that gives
the order a single designated writer: a **per-space lease**. At most one node holds a space's lease
at a time; only the holder writes to the store. A non-holder does not write — it **forwards** its
append intent to the current holder, so intents from anywhere still land in one serialized stream.

The lease is **time-bounded and re-grantable** (HA): if a holder dies, its lease expires and a new
node may take it, resuming appends from the log head (``seq`` stays contiguous across the hand-off).
A node holding a *stale* lease (already re-granted elsewhere) cannot write either — it discovers it
is no longer the holder and forwards instead, so a partitioned ex-holder can never fork the log.

Mechanism over policy (CLAUDE.md): this provides the leasing *primitive*; who acquires when, and
the TTL, are the caller's policy. Single-coordinator only — distributed multi-node leasing and log
sharding are explicit non-goals of this workplan.
"""

from __future__ import annotations

import uuid
from collections.abc import Callable, Mapping
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any

from shard_wiki.coordination.decision_log import DecisionEvent, EventStore, EventType

__all__ = ["Lease", "LeaseHeld", "LeaseRegistry", "AppendAuthority"]


def _utcnow() -> datetime:
    return datetime.now(tz=timezone.utc)


@dataclass(frozen=True, slots=True)
class Lease:
    """A time-bounded grant of single-writer authority over one space."""

    space: str
    holder: str
    token: str
    expires_at: datetime

    def valid_at(self, now: datetime) -> bool:
        return now < self.expires_at


class LeaseHeld(Exception):
    """Raised when a space's lease is validly held by a different node."""

    def __init__(self, lease: Lease) -> None:
        super().__init__(
            f"space {lease.space!r} leased to {lease.holder!r} until {lease.expires_at}"
        )
        self.lease = lease


class LeaseRegistry:
    """The single coordinator's grant table: at most one *valid* lease per space.

    A lease that has expired is freely re-grantable to any node (the HA replacement path); a still
    valid lease is exclusive to its holder (renewable by that holder). The registry also routes
    forwarded append intents to the current holder node.
    """

    def __init__(self, clock: Callable[[], datetime] = _utcnow) -> None:
        self._clock = clock
        self._leases: dict[str, Lease] = {}
        self._nodes: dict[str, AppendAuthority] = {}

    def register(self, node: AppendAuthority) -> None:
        self._nodes[node.node_id] = node

    def grant(self, space: str, holder: str, ttl_seconds: float) -> Lease:
        """Grant/renew the lease for ``space`` to ``holder``; raise :class:`LeaseHeld` if another
        node still holds it validly. An expired lease is re-grantable to anyone."""
        now = self._clock()
        current = self._leases.get(space)
        if current is not None and current.valid_at(now) and current.holder != holder:
            raise LeaseHeld(current)
        lease = Lease(
            space=space,
            holder=holder,
            token=uuid.uuid4().hex,
            expires_at=now + timedelta(seconds=ttl_seconds),
        )
        self._leases[space] = lease
        return lease

    def current(self, space: str) -> Lease | None:
        """The lease for ``space`` if one is currently valid, else None (expired/absent)."""
        lease = self._leases.get(space)
        return lease if lease is not None and lease.valid_at(self._clock()) else None

    def holder_node(self, space: str) -> AppendAuthority | None:
        lease = self.current(space)
        return self._nodes.get(lease.holder) if lease is not None else None


class AppendAuthority:
    """A coordinator node that appends to the shared log only when it holds the space's lease.

    Nodes share one :class:`EventStore` and one :class:`LeaseRegistry`. ``append`` routes itself:
    the holder writes; a non-holder forwards to whoever holds the lease (acquiring it first if the
    space is currently unleased). The append API mirrors :class:`EventStore` so the authority is a
    drop-in single-writer guard.
    """

    def __init__(
        self,
        node_id: str,
        store: EventStore,
        registry: LeaseRegistry,
        ttl_seconds: float = 30.0,
    ) -> None:
        self.node_id = node_id
        self._store = store
        self._registry = registry
        self._ttl = ttl_seconds
        registry.register(self)

    def acquire(self, space: str) -> Lease:
        """Take (or renew) the lease for ``space``. Raises :class:`LeaseHeld` if another node holds
        it validly."""
        return self._registry.grant(space, self.node_id, self._ttl)

    def holds(self, space: str) -> bool:
        lease = self._registry.current(space)
        return lease is not None and lease.holder == self.node_id

    def append(
        self,
        space: str,
        type: EventType,
        payload: Mapping[str, Any],
        actor: str | None = None,
    ) -> DecisionEvent:
        """Append via the single authority. If we hold the lease, write; otherwise forward to the
        holder. If the space is unleased, acquire it first. A node with a *stale* lease forwards
        (it is not the current holder) rather than writing — so it cannot fork the log."""
        holder_node = self._registry.holder_node(space)
        if holder_node is None:
            self.acquire(space)  # unleased: take authority, then write below
            holder_node = self
        if holder_node is self:
            return self._store.append(space, type, payload, actor=actor)
        return holder_node._write(space, type, payload, actor=actor)

    def _write(
        self,
        space: str,
        type: EventType,
        payload: Mapping[str, Any],
        actor: str | None,
    ) -> DecisionEvent:
        """Apply a forwarded intent. Called only on the lease holder by a forwarding peer."""
        return self._store.append(space, type, payload, actor=actor)
||||
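The lease rules described in `append_authority.py` (exclusive while valid, re-grantable once expired) are easiest to see with an injected clock. This is a reduced sketch rather than the module itself: `Registry` and `FakeClock` are hypothetical names, the lease token and node routing are left out, and the dates are arbitrary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Lease:
    """Reduced stand-in for the module's Lease (no token field)."""
    space: str
    holder: str
    expires_at: datetime


class LeaseHeld(Exception):
    """Raised when another node still holds a valid lease on the space."""


class FakeClock:
    """A controllable clock, so expiry can be tested without sleeping."""

    def __init__(self, start: datetime) -> None:
        self.now = start

    def __call__(self) -> datetime:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += timedelta(seconds=seconds)


class Registry:
    """At most one valid lease per space; expired leases may be taken by anyone."""

    def __init__(self, clock) -> None:
        self._clock = clock
        self._leases: dict[str, Lease] = {}

    def grant(self, space: str, holder: str, ttl_seconds: float) -> Lease:
        now = self._clock()
        current = self._leases.get(space)
        if current is not None and now < current.expires_at and current.holder != holder:
            raise LeaseHeld(f"{space!r} is held by {current.holder!r}")
        lease = Lease(space, holder, now + timedelta(seconds=ttl_seconds))
        self._leases[space] = lease
        return lease


clock = FakeClock(datetime(2024, 1, 1, tzinfo=timezone.utc))
registry = Registry(clock)

first = registry.grant("docs", "node-a", 30)
try:
    registry.grant("docs", "node-b", 30)
    contested = "granted"
except LeaseHeld:
    contested = "held"

clock.advance(31)  # node-a's lease has now expired
takeover = registry.grant("docs", "node-b", 30)

print(first.holder, contested, takeover.holder)
```

Passing the clock in as a callable is the same design the registry uses (`clock: Callable[[], datetime]`), which keeps the expiry path testable and deterministic.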
208
src/shard_wiki/coordination/decision_log.py
Normal file
@@ -0,0 +1,208 @@
|
||||
"""The event-sourced coordination decision log — the keystone (CoreArchitectureBlueprint §8.1).
|
||||
|
||||
Coordination-canonical state (overlays, equivalence bindings, aliases, merges, forks) is an
|
||||
**append-only decision log**, not a mutable file; the queryable *current* state is a **derived
|
||||
fold** of the log (tier-3 disposable). The log is **totally ordered per space** via a single
|
||||
**append authority**. That total order is what gives read-your-writes across readers (§8.6).
|
||||
|
||||
Storage lives behind :class:`EventStore`: :class:`InMemoryEventStore` is the default test double
|
||||
(an in-process counter); :class:`~shard_wiki.coordination.git_event_store.GitEventStore` is the
|
||||
git-addressable backend (SHARD-WP-0009). The :class:`DecisionLog` API and the :meth:`fold` are
|
||||
identical across backends — only storage + the concurrency model differ.
|
||||
|
||||
`derived = f(canonical)`: :class:`CoordinationState` is always reproducible by replaying the log.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from collections.abc import Mapping
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime, timezone
|
||||
from enum import Enum
|
||||
from types import MappingProxyType
|
||||
from typing import Any, Protocol, runtime_checkable
|
||||
|
||||
__all__ = [
|
||||
"EventType",
|
||||
"DecisionEvent",
|
||||
"CoordinationState",
|
||||
"EventStore",
|
||||
"InMemoryEventStore",
|
||||
"DecisionLog",
|
||||
"serialize_event",
|
||||
"deserialize_event",
|
||||
]
|
||||
|
||||
|
||||
class EventType(Enum):
|
||||
OVERLAY_CREATED = "overlay-created"
|
||||
BINDING_MADE = "binding-made"
|
||||
ALIAS_SET = "alias-set"
|
||||
MERGE_DECIDED = "merge-decided"
|
||||
PAGE_FORKED = "page-forked"
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class DecisionEvent:
|
||||
"""One immutable, ordered decision. ``seq`` is the per-space total order."""
|
||||
|
||||
seq: int
|
||||
space: str
|
||||
type: EventType
|
||||
payload: Mapping[str, Any]
|
||||
actor: str | None = None
|
||||
timestamp: datetime = field(default_factory=lambda: datetime.now(tz=timezone.utc))
|
||||
|
||||
|
||||
@dataclass(frozen=True, slots=True)
|
||||
class CoordinationState:
|
||||
"""The derived fold of a space's log: current aliases + equivalence groups + open overlays.
|
||||
|
||||
Disposable (tier-3): always recomputable from the log via :meth:`DecisionLog.fold`.
|
||||
"""
|
||||
|
||||
aliases: Mapping[str, str]
|
||||
equivalence_groups: tuple[frozenset[str], ...]
|
||||
open_overlays: Mapping[str, Mapping[str, Any]]
|
||||
|
||||
def resolve_alias(self, name: str) -> str | None:
|
||||
return self.aliases.get(name)
|
||||
|
||||
def equivalent_to(self, identity: str) -> frozenset[str]:
|
||||
"""All identities equivalent to ``identity`` (including itself if bound), else just it."""
|
||||
for group in self.equivalence_groups:
|
||||
if identity in group:
|
||||
return group
|
||||
return frozenset({identity})
|
||||
|
||||
|
||||
def serialize_event(event: DecisionEvent) -> bytes:
    """Deterministic, stable-JSON wire form of an event (same bytes for equal events, any process).

    Sorted keys + compact separators make the serialization canonical, so a git object hashed from
    it is reproducible: the basis for content-addressable, comparable logs across backends.
    """
    obj = {
        "seq": event.seq,
        "space": event.space,
        "type": event.type.value,
        "payload": event.payload,
        "actor": event.actor,
        "timestamp": event.timestamp.isoformat(),
    }
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode()


def deserialize_event(data: bytes | str) -> DecisionEvent:
    """Inverse of :func:`serialize_event`: rebuilds the event field for field."""
    obj = json.loads(data)
    return DecisionEvent(
        seq=obj["seq"],
        space=obj["space"],
        type=EventType(obj["type"]),
        payload=obj["payload"],
        actor=obj["actor"],
        timestamp=datetime.fromisoformat(obj["timestamp"]),
    )


@runtime_checkable
class EventStore(Protocol):
    """Append-only, per-space ordered storage behind :class:`DecisionLog`.

    Two bindings exist: :class:`InMemoryEventStore` (default/test double) and
    :class:`~shard_wiki.coordination.git_event_store.GitEventStore` (git-addressable). Both assign
    a per-space monotonic ``seq`` at the log head and guarantee read-your-writes for their reach
    (in-process for memory; cross-process for git).
    """

    def append(
        self, space: str, type: EventType, payload: Mapping[str, Any], actor: str | None = None
    ) -> DecisionEvent: ...

    def events(self, space: str) -> tuple[DecisionEvent, ...]: ...


class InMemoryEventStore:
    """In-process append-only store, totally ordered per space (the append authority for a process).

    The default test double; the git backend preserves this exact contract on durable storage.
    """

    def __init__(self) -> None:
        self._events: dict[str, list[DecisionEvent]] = {}

    def append(
        self,
        space: str,
        type: EventType,
        payload: Mapping[str, Any],
        actor: str | None = None,
    ) -> DecisionEvent:
        seq = len(self._events.get(space, ()))
        event = DecisionEvent(seq=seq, space=space, type=type, payload=dict(payload), actor=actor)
        self._events.setdefault(space, []).append(event)
        return event

    def events(self, space: str) -> tuple[DecisionEvent, ...]:
        return tuple(self._events.get(space, ()))


class DecisionLog:
    """Append-only decision log, totally ordered per space, with a derived :meth:`fold`.

    Storage is delegated to an :class:`EventStore` (default :class:`InMemoryEventStore`); swapping
    in the git backend changes only durability + the concurrency model, not this API or the fold.
    """

    def __init__(self, store: EventStore | None = None) -> None:
        self._store: EventStore = store if store is not None else InMemoryEventStore()

    def append(
        self,
        space: str,
        type: EventType,
        payload: Mapping[str, Any],
        actor: str | None = None,
    ) -> DecisionEvent:
        return self._store.append(space, type, payload, actor=actor)

    def events(self, space: str) -> tuple[DecisionEvent, ...]:
        """The space's events in append (total) order. Read-your-writes: a just-appended event
        is present immediately."""
        return self._store.events(space)

    def fold(self, space: str) -> CoordinationState:
        """Replay the log into current coordination state (derived = f(log))."""
        aliases: dict[str, str] = {}
        overlays: dict[str, dict[str, Any]] = {}
        groups: list[set[str]] = []

        for event in self.events(space):
            if event.type is EventType.ALIAS_SET:
                aliases[event.payload["alias"]] = event.payload["target"]
            elif event.type is EventType.BINDING_MADE:
                _merge_group(groups, {str(m) for m in event.payload["members"]})
            elif event.type is EventType.OVERLAY_CREATED:
                overlays[event.payload["overlay_id"]] = dict(event.payload)
            elif event.type is EventType.MERGE_DECIDED:
                # A merge resolution may collapse an overlay; minimal handling for the slice.
                overlays.pop(event.payload.get("overlay_id", ""), None)
            elif event.type is EventType.PAGE_FORKED:
                _merge_group(groups, {str(event.payload["source"]), str(event.payload["fork"])})

        return CoordinationState(
            aliases=MappingProxyType(dict(aliases)),
            equivalence_groups=tuple(frozenset(g) for g in groups),
            open_overlays=MappingProxyType({k: MappingProxyType(v) for k, v in overlays.items()}),
        )


def _merge_group(groups: list[set[str]], members: set[str]) -> None:
    """Union-merge ``members`` into ``groups`` (any existing group sharing a member absorbs it)."""
    touching = [g for g in groups if g & members]
    for g in touching:
        groups.remove(g)
        members |= g
    groups.append(members)
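The fold replays events in order, and the union-merge step keeps equivalence groups disjoint by absorbing every existing group that shares a member with the new binding. A minimal, self-contained sketch of that behaviour follows. It uses plain `(kind, payload)` tuples instead of `DecisionEvent`, handles only the alias, binding and fork events, and the names `merge_group` and `fold_groups` are illustrative stand-ins rather than the module's API.

```python
def merge_group(groups, members):
    """Union-merge `members` into `groups`, absorbing every group that shares a member."""
    merged = set(members)  # copy so the caller's set is left untouched
    for group in [g for g in groups if g & merged]:
        groups.remove(group)
        merged |= group
    groups.append(merged)


def fold_groups(events):
    """Replay (kind, payload) pairs into (aliases, equivalence_groups)."""
    aliases, groups = {}, []
    for kind, payload in events:
        if kind == "alias-set":
            # Later alias events overwrite earlier ones: last write wins.
            aliases[payload["alias"]] = payload["target"]
        elif kind == "binding-made":
            merge_group(groups, {str(m) for m in payload["members"]})
        elif kind == "page-forked":
            merge_group(groups, {str(payload["source"]), str(payload["fork"])})
    return aliases, tuple(frozenset(g) for g in groups)


log = [
    ("alias-set", {"alias": "home", "target": "a:Home"}),
    ("binding-made", {"members": ["a:Home", "b:Home"]}),
    ("binding-made", {"members": ["c:Start", "d:Start"]}),
    ("page-forked", {"source": "b:Home", "fork": "c:Start"}),  # bridges both groups
    ("alias-set", {"alias": "home", "target": "b:Home"}),
]
aliases, groups = fold_groups(log)
```

Because the state is a pure function of the event sequence, replaying the same log always yields the same aliases and groups, which is what makes the derived state disposable.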
172
src/shard_wiki/coordination/git_event_store.py
Normal file
@@ -0,0 +1,172 @@
"""GitEventStore: a git-addressable binding of :class:`EventStore` (SHARD-WP-0009 T1).

Each space is a ref (``refs/spaces/<sha1(space)>``); each ``append`` writes the event as an
immutable git object (a one-blob tree committed onto the ref) and advances the ref. The commit
chain *is* the totally ordered log: ``seq`` is the depth, ``events`` walks first-parent from the
head oldest→newest. Coordination-canonical state therefore inherits git's history / patch /
review / backup affordances (I-6) and is read-your-writes correct across processes.

The total order is enforced at storage by a **compare-and-swap** ref update
(``git update-ref <ref> <new> <old>``): when two appenders race off the same head, the loser's
CAS fails and it retries off the new head, so a non-holder can never fork the log. The lease layer
(T2) sits *above* this as the append-authority policy; CAS is the mechanism that makes it safe.

Implemented over the ``git`` CLI through :mod:`subprocess`, with zero runtime dependencies.
"""

from __future__ import annotations

import hashlib
import os
import subprocess
from collections.abc import Mapping
from pathlib import Path
from typing import Any

from shard_wiki.coordination.decision_log import (
    DecisionEvent,
    EventType,
    deserialize_event,
    serialize_event,
)

__all__ = ["GitEventStore"]

# Fixed identity so commit objects are reproducible and never prompt for git config; the event's
# own timestamp/actor carry the real provenance, the commit is just the ordered container.
_GIT_IDENTITY = {
    "GIT_AUTHOR_NAME": "shard-wiki",
    "GIT_AUTHOR_EMAIL": "coordination@shard-wiki",
    "GIT_COMMITTER_NAME": "shard-wiki",
    "GIT_COMMITTER_EMAIL": "coordination@shard-wiki",
}
_EVENT_PATH = "event.json"
_MAX_CAS_RETRIES = 50


class GitEventStore:
    """Git-backed, append-only, per-space ordered event store (an :class:`EventStore`)."""

    def __init__(self, repo_path: str | Path) -> None:
        self.repo_path = Path(repo_path)
        self.repo_path.mkdir(parents=True, exist_ok=True)
        if not (self.repo_path / "HEAD").exists() and not (self.repo_path / ".git").exists():
            self._git("init", "--quiet", str(self.repo_path), at_cwd=True)

    # -- EventStore contract -------------------------------------------------

    def append(
        self,
        space: str,
        type: EventType,
        payload: Mapping[str, Any],
        actor: str | None = None,
    ) -> DecisionEvent:
        """Append one event, advancing the space ref under compare-and-swap (retry-on-race)."""
        ref = self._ref(space)
        for _ in range(_MAX_CAS_RETRIES):
            head = self._head(ref)
            seq = self._count(ref, head)
            event = DecisionEvent(
                seq=seq, space=space, type=type, payload=dict(payload), actor=actor
            )
            commit = self._commit_event(event, parent=head)
            if self._cas_update(ref, new=commit, old=head):
                return event
        raise RuntimeError(f"append contention on {space!r}: exhausted {_MAX_CAS_RETRIES} retries")

    def import_event(self, event: DecisionEvent) -> None:
        """Replay one pre-existing event *verbatim* (preserving seq / timestamp / actor) onto its
        space ref: the one-time migration path (SHARD-WP-0009 T4), not a live append.

        Refuses out-of-order import so the imported chain stays a contiguous total order; preserving
        the original fields keeps provenance intact (union-without-erasure) rather than restamping.
        """
        ref = self._ref(event.space)
        head = self._head(ref)
        expected = self._count(ref, head)
        if event.seq != expected:
            raise ValueError(
                f"out-of-order import on {event.space!r}: expected seq {expected}, got {event.seq}"
            )
        commit = self._commit_event(event, parent=head)
        if not self._cas_update(ref, new=commit, old=head):
            raise RuntimeError(f"import race on {ref}")

    def events(self, space: str) -> tuple[DecisionEvent, ...]:
        """The space's events oldest→newest (append/total order)."""
        ref = self._ref(space)
        head = self._head(ref)
        if head is None:
            return ()
        shas = self._git("rev-list", "--reverse", "--first-parent", ref).decode().split()
        return tuple(
            deserialize_event(self._git("cat-file", "blob", f"{sha}:{_EVENT_PATH}"))
            for sha in shas
        )

    # -- git plumbing --------------------------------------------------------

    def _commit_event(self, event: DecisionEvent, parent: str | None) -> str:
        blob = self._git(
            "hash-object", "-w", "--stdin", stdin=serialize_event(event)
        ).decode().strip()
        tree = self._git(
            "mktree", stdin=f"100644 blob {blob}\t{_EVENT_PATH}\n".encode()
        ).decode().strip()
        args = ["commit-tree", tree, "-m", f"event {event.seq} {event.type.value}"]
        if parent is not None:
            args += ["-p", parent]
        # Pin the commit date to the event's timestamp for reproducible objects.
        date = event.timestamp.isoformat()
        env = {**_GIT_IDENTITY, "GIT_AUTHOR_DATE": date, "GIT_COMMITTER_DATE": date}
        return self._git(*args, env=env).decode().strip()

    def _cas_update(self, ref: str, new: str, old: str | None) -> bool:
        """``git update-ref`` with the old value as a CAS guard (empty oldvalue == must-not-exist).

        Returns False if the ref moved since we read ``old`` (lost the race); the caller retries.
        """
        result = self._run("update-ref", ref, new, old if old is not None else "")
        return result.returncode == 0

    def _head(self, ref: str) -> str | None:
        result = self._run("rev-parse", "--verify", "--quiet", ref)
        out = result.stdout.decode().strip()
        return out or None

    def _count(self, ref: str, head: str | None) -> int:
        if head is None:
            return 0
        return int(self._git("rev-list", "--count", "--first-parent", ref).decode().strip())

    @staticmethod
    def _ref(space: str) -> str:
        return f"refs/spaces/{hashlib.sha1(space.encode()).hexdigest()}"

    def _git(
        self,
        *args: str,
        stdin: bytes | None = None,
        env: dict | None = None,
        at_cwd: bool = False,
    ) -> bytes:
        result = self._run(*args, stdin=stdin, env=env, at_cwd=at_cwd, check=True)
        return result.stdout

    def _run(
        self,
        *args: str,
        stdin: bytes | None = None,
        env: dict | None = None,
        at_cwd: bool = False,
        check: bool = False,
    ) -> subprocess.CompletedProcess:
        base = ["git"] if at_cwd else ["git", "-C", str(self.repo_path)]
        return subprocess.run(
            [*base, *args],
            input=stdin,
            capture_output=True,
            env={**os.environ, **(env or {})},
            check=check,
        )
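The append path is a compare-and-swap retry loop: read the head, build the next node on top of it, and move the ref only if nobody else moved it first. The sketch below illustrates that loop without git. `RefCell` is a hypothetical in-memory stand-in for a single ref, with a lock playing the role that `git update-ref <ref> <new> <old>` plays in the store, and events are plain dicts linked by a `parent` key rather than commits.

```python
import threading


class RefCell:
    """In-memory stand-in for one git ref (hypothetical; the real store shells out to git)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._head = None  # None plays the role of "ref does not exist yet"

    def read(self):
        with self._lock:
            return self._head

    def compare_and_swap(self, new, old):
        """Move the ref to `new` only if it still points at `old`; report success."""
        with self._lock:
            if self._head is not old:
                return False  # someone else advanced the ref first
            self._head = new
            return True


def append(ref, payload, max_retries=50):
    """Append `payload` as the next node in the chain, retrying when the CAS loses a race."""
    for _ in range(max_retries):
        head = ref.read()
        seq = 0 if head is None else head["seq"] + 1
        node = {"seq": seq, "payload": payload, "parent": head}
        if ref.compare_and_swap(node, head):
            return seq
    raise RuntimeError(f"append contention: exhausted {max_retries} retries")


def events(ref):
    """Walk parent links from the head and return payloads oldest to newest."""
    out, node = [], ref.read()
    while node is not None:
        out.append(node["payload"])
        node = node["parent"]
    return out[::-1]
```

A writer holding a stale head can never install its node, so the chain stays linear however many appenders race; the losing writer simply recomputes `seq` from the new head and tries again.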
53
src/shard_wiki/coordination/migration.py
Normal file
@@ -0,0 +1,53 @@
"""One-time migration of a coordination log into git (SHARD-WP-0009 T4).

Replays an existing decision log (an in-memory store, or a JSON-lines export) into a
:class:`GitEventStore`, preserving each event verbatim (seq / timestamp / actor) so provenance
survives the move (union-without-erasure). After migration the same :meth:`DecisionLog.fold`
reproduces identical coordination state; only durability changes.
"""

from __future__ import annotations

from collections.abc import Iterable
from pathlib import Path

from shard_wiki.coordination.decision_log import (
    DecisionEvent,
    EventStore,
    deserialize_event,
    serialize_event,
)
from shard_wiki.coordination.git_event_store import GitEventStore

__all__ = ["import_log", "migrate_space", "export_jsonl", "import_jsonl"]


def import_log(events: Iterable[DecisionEvent], dest: GitEventStore) -> int:
    """Replay ``events`` (in space/seq order) into ``dest``. Returns the count imported."""
    count = 0
    for event in events:
        dest.import_event(event)
        count += 1
    return count


def migrate_space(source: EventStore, space: str, dest: GitEventStore) -> int:
    """Migrate one space's log from any :class:`EventStore` into the git backend verbatim."""
    return import_log(source.events(space), dest)


def export_jsonl(events: Iterable[DecisionEvent], path: str | Path) -> int:
    """Write events as newline-delimited canonical JSON (a portable, diffable log export)."""
    count = 0
    with open(path, "wb") as handle:
        for event in events:
            handle.write(serialize_event(event) + b"\n")
            count += 1
    return count


def import_jsonl(path: str | Path, dest: GitEventStore) -> int:
    """Replay a JSON-lines export (see :func:`export_jsonl`) into the git backend."""
    with open(path, "rb") as handle:
        events = [deserialize_event(line) for line in handle if line.strip()]
    return import_log(events, dest)
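The export/import pair amounts to one canonical JSON object per line, replayed in order with a contiguity check on `seq`. The following sketch shows that round trip under simplifying assumptions: events are plain dicts, the destination is an ordinary `dict` keyed by space rather than a `GitEventStore`, and `roundtrip` is a hypothetical helper added only for the demonstration.

```python
import json
import os
import tempfile


def export_jsonl(events, path):
    """Write one canonical JSON object per line (sorted keys, compact separators)."""
    with open(path, "wb") as handle:
        for event in events:
            line = json.dumps(event, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
            handle.write(line.encode() + b"\n")
    return len(events)


def import_jsonl(path, dest):
    """Replay a JSON-lines file into `dest`, refusing any event that breaks the sequence."""
    count = 0
    with open(path, "rb") as handle:
        for raw in handle:
            if not raw.strip():
                continue  # tolerate blank lines in the export
            event = json.loads(raw)
            expected = len(dest.setdefault(event["space"], []))
            if event["seq"] != expected:
                raise ValueError(f"out-of-order import: expected seq {expected}, got {event['seq']}")
            dest[event["space"]].append(event)
            count += 1
    return count


def roundtrip(events):
    """Export to a temporary file, import into a fresh dict, and return that dict."""
    fd, path = tempfile.mkstemp(suffix=".jsonl")
    os.close(fd)
    try:
        export_jsonl(events, path)
        dest = {}
        import_jsonl(path, dest)
        return dest
    finally:
        os.remove(path)


log = [
    {"seq": 0, "space": "docs", "type": "alias-set",
     "payload": {"alias": "home", "target": "a:Home"},
     "actor": "ann", "timestamp": "2024-01-01T00:00:00+00:00"},
    {"seq": 1, "space": "docs", "type": "binding-made",
     "payload": {"members": ["a:Home", "b:Home"]},
     "actor": None, "timestamp": "2024-01-02T00:00:00+00:00"},
]
migrated = roundtrip(log)
```

Every field, including the original `seq`, `actor` and `timestamp`, survives unchanged, which is the property the migration relies on to keep provenance intact.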
134
src/shard_wiki/coordination/overlay.py
Normal file
@@ -0,0 +1,134 @@
"""Overlay engine: overlay-before-mutation (FederationRequirements ADR-05, blueprint §8.2).

An overlay is a non-destructive local edit against a page. It is **coordination-canonical**: an
``OVERLAY_CREATED`` event in the decision log (§8.1), not a mutable side file. The current set
of open overlays is the log fold. ``draft`` records one; ``apply`` (T4) resolves it under drift.
"""

from __future__ import annotations

import uuid
from collections.abc import Mapping
from dataclasses import dataclass
from enum import Enum
from typing import Any

from shard_wiki.adapters import ShardAdapter
from shard_wiki.coordination.decision_log import DecisionLog, EventType
from shard_wiki.model import Identity, Page, Verb
from shard_wiki.provenance import OverlayState

__all__ = ["Overlay", "OverlayEngine", "ApplyStatus", "ApplyResult"]


@dataclass(frozen=True, slots=True)
class Overlay:
    """A non-destructive edit: the proposed ``body`` for ``target``, authored against
    ``base_rev`` (the shard revision seen at draft time, for drift detection)."""

    overlay_id: str
    target: Identity
    base_rev: str | None
    body: str
    state: OverlayState = OverlayState.DRAFT

    def to_payload(self) -> dict[str, Any]:
        return {
            "overlay_id": self.overlay_id,
            "target_shard": self.target.shard,
            "target_key": self.target.key,
            "base_rev": self.base_rev,
            "body": self.body,
            "state": self.state.value,
        }

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> Overlay:
        return cls(
            overlay_id=payload["overlay_id"],
            target=Identity(payload["target_shard"], payload["target_key"]),
            base_rev=payload["base_rev"],
            body=payload["body"],
            state=OverlayState(payload.get("state", OverlayState.DRAFT.value)),
        )


class ApplyStatus(Enum):
    APPLIED = "applied"  # fast-forwarded and written through
    REFUSED_DRIFT = "refused-drift"  # source moved under the overlay; no clobber
    KEPT_DRAFT = "kept-draft"  # target read-only; overlay remains the local truth


@dataclass(frozen=True, slots=True)
class ApplyResult:
    status: ApplyStatus
    overlay_id: str
    page: Page | None = None
    detail: str = ""


class OverlayEngine:
    def __init__(self, space: str, log: DecisionLog) -> None:
        self.space = space
        self.log = log

    def draft(
        self,
        target: Identity,
        body: str,
        base_rev: str | None,
        actor: str | None = None,
    ) -> Overlay:
        """Create a draft overlay and record it in the decision log (coordination-canonical)."""
        overlay = Overlay(uuid.uuid4().hex, target, base_rev, body)
        self.log.append(self.space, EventType.OVERLAY_CREATED, overlay.to_payload(), actor=actor)
        return overlay

    def get(self, overlay_id: str) -> Overlay | None:
        payload = self.log.fold(self.space).open_overlays.get(overlay_id)
        return Overlay.from_payload(payload) if payload is not None else None

    def open_overlays(self) -> tuple[Overlay, ...]:
        state = self.log.fold(self.space)
        return tuple(Overlay.from_payload(p) for p in state.open_overlays.values())

    def apply(self, overlay_id: str, adapter: ShardAdapter) -> ApplyResult:
        """Resolve an overlay against its target shard with apply-under-drift semantics (§8.6).

        Read-only target → ``KEPT_DRAFT`` (the overlay stays the local truth). Otherwise compare
        the overlay's ``base_rev`` to the shard's current rev: equal → fast-forward write-through
        (``APPLIED``); changed → ``REFUSED_DRIFT`` (never clobber, I-5). Applying records a
        ``MERGE_DECIDED`` event that closes the overlay in the fold.
        """
        overlay = self.get(overlay_id)
        if overlay is None:
            raise KeyError(overlay_id)
        if adapter.shard_id != overlay.target.shard:
            raise ValueError(f"adapter {adapter.shard_id!r} != target {overlay.target.shard!r}")

        if not adapter.profile().supports(Verb.WRITE):
            return ApplyResult(
                ApplyStatus.KEPT_DRAFT, overlay_id, detail="target is read-only; overlay retained"
            )

        current = _current_rev(adapter, overlay.target.key)
        if current != overlay.base_rev:
            return ApplyResult(
                ApplyStatus.REFUSED_DRIFT,
                overlay_id,
                detail=f"base_rev {overlay.base_rev!r} != current {current!r}",
            )

        page = adapter.write(overlay.target.key, overlay.body)
        self.log.append(
            self.space,
            EventType.MERGE_DECIDED,
            {"overlay_id": overlay_id, "outcome": "applied"},
        )
        return ApplyResult(ApplyStatus.APPLIED, overlay_id, page=page)


def _current_rev(adapter: ShardAdapter, key: str) -> str | None:
    """Best-effort current-revision probe; adapters without one are treated as no-rev."""
    probe = getattr(adapter, "current_rev", None)
    return probe(key) if callable(probe) else None
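Stripped of the adapter and log plumbing, the apply-under-drift rule is a three-way decision on two inputs: whether the target accepts writes, and whether the revision seen at draft time still matches the shard's current revision. The sketch below isolates that decision; `Status` and `decide` are hypothetical names used only for illustration, and revisions are plain strings or `None`.

```python
from enum import Enum


class Status(Enum):
    APPLIED = "applied"
    REFUSED_DRIFT = "refused-drift"
    KEPT_DRAFT = "kept-draft"


def decide(writable, base_rev, current_rev):
    """Return the outcome of applying an overlay drafted against `base_rev`."""
    if not writable:
        # A read-only target is checked first: the overlay stays the local truth.
        return Status.KEPT_DRAFT
    if current_rev != base_rev:
        # The source moved under the overlay; refuse rather than overwrite.
        return Status.REFUSED_DRIFT
    # Revisions match, so the write is a fast-forward.
    return Status.APPLIED
```

One consequence worth noting: an adapter with no revision probe reports `None`, so an overlay drafted with `base_rev=None` compares equal and is applied, while one drafted against a concrete revision is refused.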
38
src/shard_wiki/coordination/patch.py
Normal file
@@ -0,0 +1,38 @@
"""Patch rendering: an overlay as a reviewable diff (FederationRequirements ADR-05).

A pure function over (base body, overlay body): the auditable change proposal that an overlay
becomes before it is applied. This slice handles Markdown/native text (lossless); a lossy
native-syntax-with-fidelity-report variant comes later (TSD §A.6).
"""

from __future__ import annotations

import difflib
from dataclasses import dataclass

from shard_wiki.coordination.overlay import Overlay
from shard_wiki.model import Identity

__all__ = ["Patch", "render_patch"]


@dataclass(frozen=True, slots=True)
class Patch:
    target: Identity
    diff: str

    @property
    def is_empty(self) -> bool:
        return self.diff == ""


def render_patch(overlay: Overlay, base_body: str) -> Patch:
    """Render ``base_body`` → ``overlay.body`` as a unified diff against the overlay target."""
    label = str(overlay.target)
    lines = difflib.unified_diff(
        base_body.splitlines(keepends=True),
        overlay.body.splitlines(keepends=True),
        fromfile=f"a/{label}",
        tofile=f"b/{label}",
    )
    return Patch(target=overlay.target, diff="".join(lines))
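The rendering is a thin wrapper over the standard library's `difflib.unified_diff`. A standalone sketch of the same call follows, with the overlay target replaced by a plain string label; `render_diff` is a hypothetical helper name, not the module's function.

```python
import difflib


def render_diff(label, base_body, new_body):
    """Return a unified diff from `base_body` to `new_body`, or "" when they are equal."""
    lines = difflib.unified_diff(
        base_body.splitlines(keepends=True),  # keepends preserves the original newlines
        new_body.splitlines(keepends=True),
        fromfile=f"a/{label}",
        tofile=f"b/{label}",
    )
    return "".join(lines)


diff = render_diff("wiki:Home", "Hello\nworld\n", "Hello\nthere\n")
unchanged = render_diff("wiki:Home", "same\n", "same\n")
```

Identical bodies produce no diff lines at all, which is why an emptiness check on the joined string is enough to detect a no-op overlay.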
49
src/shard_wiki/engine/__init__.py
Normal file
@@ -0,0 +1,49 @@
"""engine/: shard-wiki's native, headless wiki engine (a canonical-mode shard backend).

A small page-store kernel + a typed-extension runtime (WikiEngineCoreArchitecture). The engine
is *one shard*: it is consumed by the orchestrator only via its `EngineShardAdapter`; it never
imports the derived tier (`union`/`projection`).
"""

from shard_wiki.engine.activation import (
    ActivationContext,
    ActivationProvider,
    ActivationResolver,
    StaticProvider,
    feature_control_provider,
)
from shard_wiki.engine.extension import (
    ActiveExtensions,
    Extension,
    ExtensionError,
    ExtensionRuntime,
    Hook,
)
from shard_wiki.engine.adapter import EngineShardAdapter, build_engine_shard
from shard_wiki.engine.kernel import EngineKernel
from shard_wiki.engine.links import extract_wikilinks
from shard_wiki.engine.profile import (
    ProfileContribution,
    derive_profile,
    engine_base_profile,
)

__all__ = [
    "EngineKernel",
    "extract_wikilinks",
    "Hook",
    "Extension",
    "ExtensionError",
    "ExtensionRuntime",
    "ActiveExtensions",
    "ActivationContext",
    "ActivationProvider",
    "StaticProvider",
    "ActivationResolver",
    "feature_control_provider",
    "engine_base_profile",
    "ProfileContribution",
    "derive_profile",
    "EngineShardAdapter",
    "build_engine_shard",
]
129
src/shard_wiki/engine/activation.py
Normal file
@@ -0,0 +1,129 @@
"""Per-shard extension activation (WikiEngineCoreArchitecture E-4/E-8, ADR-0001).

Decides *which registered extensions are active* for a given shard: **availability only, never
authorization**. The mechanism is OpenFeature-shaped and **provider-pluggable**:

- :class:`StaticProvider` is the **standalone / L0 default**: zero external dependency, in-process
  flags with optional per-shard scoping. A kernel-only or offline shard uses this.
- :func:`feature_control_provider` lazily wraps the helix_forge ``feature_control_sdk`` (over
  OpenFeature) when it is installed; absent, it returns ``None`` and the caller falls back to the
  static default. This mirrors shard-wiki's identity-provider ladder (local default → external
  when present), and keeps the engine core pure-stdlib.

An *activation profile* is ``{extension id → config}``; the active id set then feeds the
extension runtime's `activate()` (T2) and the derived capability profile (T4, E-5).
"""

from __future__ import annotations

from collections.abc import Iterable, Mapping
from dataclasses import dataclass, field
from typing import Any, Protocol, runtime_checkable

__all__ = [
    "ActivationContext",
    "ActivationProvider",
    "StaticProvider",
    "ActivationResolver",
    "feature_control_provider",
]


@dataclass(frozen=True, slots=True)
class ActivationContext:
    """Scope for an activation decision. Carries no principal/authz; availability only."""

    shard_id: str
    tenant_id: str | None = None
    extra: Mapping[str, Any] = field(default_factory=dict)

    def as_dict(self) -> dict[str, Any]:
        d: dict[str, Any] = {"shard_id": self.shard_id}
        if self.tenant_id is not None:
            d["tenant_id"] = self.tenant_id
        d.update(self.extra)
        return d


@runtime_checkable
class ActivationProvider(Protocol):
    """Evaluates feature availability for an extension key in a context (OpenFeature-shaped)."""

    def is_active(self, feature_key: str, context: Mapping[str, Any]) -> bool: ...

    def config(self, feature_key: str, context: Mapping[str, Any]) -> Mapping[str, Any]: ...


@dataclass(frozen=True, slots=True)
class StaticProvider:
    """The standalone default: in-process flags, optionally overridden per shard. Zero deps.

    ``flags`` is the base availability map; ``per_shard[shard_id]`` overrides it for one shard;
    ``configs[feature_key]`` supplies per-extension config. Unknown keys → ``default``.
    """

    flags: Mapping[str, bool] = field(default_factory=dict)
    per_shard: Mapping[str, Mapping[str, bool]] = field(default_factory=dict)
    configs: Mapping[str, Mapping[str, Any]] = field(default_factory=dict)
    default: bool = False

    def is_active(self, feature_key: str, context: Mapping[str, Any]) -> bool:
        shard = context.get("shard_id")
        scoped = self.per_shard.get(shard, {}) if shard is not None else {}
        if feature_key in scoped:
            return scoped[feature_key]
        return self.flags.get(feature_key, self.default)

    def config(self, feature_key: str, context: Mapping[str, Any]) -> Mapping[str, Any]:
        return self.configs.get(feature_key, {})


class ActivationResolver:
    """Maps candidate extension ids → the active set / activation profile for a context."""

    def __init__(self, provider: ActivationProvider) -> None:
        self.provider = provider

    def active_extensions(
        self, candidate_ids: Iterable[str], context: ActivationContext
    ) -> set[str]:
        ctx = context.as_dict()
        return {eid for eid in candidate_ids if self.provider.is_active(eid, ctx)}

    def activation_profile(
        self, candidate_ids: Iterable[str], context: ActivationContext
    ) -> dict[str, Mapping[str, Any]]:
        """``{extension id → config}`` for the active subset."""
        ctx = context.as_dict()
        return {
            eid: self.provider.config(eid, ctx)
            for eid in candidate_ids
            if self.provider.is_active(eid, ctx)
        }


def feature_control_provider(domain: str | None = None) -> ActivationProvider | None:
    """Return a feature-control-backed provider if ``feature_control_sdk`` is importable, else
    ``None`` (caller falls back to :class:`StaticProvider`). Lazy import keeps the engine core
    dependency-free (ADR-0001)."""
    try:  # optional engine extra, not a core dependency
        from feature_control_sdk import FeatureControlClient  # type: ignore
    except ImportError:
        return None

    client = FeatureControlClient(domain=domain)

    @dataclass(frozen=True, slots=True)
    class _FeatureControlProvider:
        _client: Any

        def is_active(self, feature_key: str, context: Mapping[str, Any]) -> bool:
            return bool(
                self._client.get_boolean_value(feature_key, False, context=dict(context))
            )

        def config(self, feature_key: str, context: Mapping[str, Any]) -> Mapping[str, Any]:
            getter = getattr(self._client, "get_object_value", None)
            return dict(getter(feature_key, {}, context=dict(context))) if getter else {}

    return _FeatureControlProvider(client)
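The static provider resolves a flag in a fixed precedence order: a per-shard override wins, then the base flag map, then the default. A small sketch of that lookup, written as free functions over plain dicts, is shown below; the function names and the extension ids (`wikilinks`, `history`, `search`) are made up for the example.

```python
def is_active(feature_key, shard_id, flags, per_shard, default=False):
    """Resolve one flag: per-shard override first, then base flags, then the default."""
    scoped = per_shard.get(shard_id, {}) if shard_id is not None else {}
    if feature_key in scoped:
        return scoped[feature_key]
    return flags.get(feature_key, default)


def active_extensions(candidates, shard_id, flags, per_shard, default=False):
    """Filter candidate extension ids down to the ones active for this shard."""
    return {eid for eid in candidates if is_active(eid, shard_id, flags, per_shard, default)}


flags = {"wikilinks": True, "history": True}
per_shard = {"archive": {"history": False, "search": True}}
candidates = ["wikilinks", "history", "search"]

on_archive = active_extensions(candidates, "archive", flags, per_shard)
on_other = active_extensions(candidates, "notes", flags, per_shard)
```

Note that an override can switch a flag off as well as on, so the presence of the key in the per-shard map matters, not its truthiness.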
82
src/shard_wiki/engine/adapter.py
Normal file
@@ -0,0 +1,82 @@
"""EngineShardAdapter: the engine exposed as a canonical-mode shard (WikiEngineCoreArchitecture
§6, E-1/E-5).

The engine is *one shard*: the orchestrator consumes it only through this `ShardAdapter`. The
adapter is backed by the kernel (T1) + a composed extension set (T2/T3); its §A capability
profile is **derived from the active extensions** (T4), so the orchestrator sees an honest,
conformance-verifiable profile that reflects exactly what is activated. Read/write run the
extensions' transform hooks; everything above this stays in the orchestrator (no union/projection
import).
"""

from __future__ import annotations

from collections.abc import Iterable

from shard_wiki.adapters import ShardAdapter
from shard_wiki.engine.activation import ActivationContext, ActivationProvider, ActivationResolver
from shard_wiki.engine.extension import ActiveExtensions, ExtensionRuntime, Hook
from shard_wiki.engine.kernel import EngineKernel
from shard_wiki.engine.profile import derive_profile
from shard_wiki.model import CapabilityProfile, NotSupported, Page, Verb

__all__ = ["EngineShardAdapter", "build_engine_shard"]


class EngineShardAdapter(ShardAdapter):
    def __init__(
        self,
        kernel: EngineKernel,
        active: ActiveExtensions,
        base_profile: CapabilityProfile | None = None,
    ) -> None:
        self._kernel = kernel
        self._active = active
        self._profile = derive_profile(active, base_profile)  # validated (E-5)

    @property
    def shard_id(self) -> str:
        return self._kernel.shard_id

    def profile(self) -> CapabilityProfile:
        return self._profile

    def keys(self) -> Iterable[str]:
        return self._kernel.keys()

    def read(self, key: str) -> Page:
        page = self._kernel.read(key)
        return self._active.dispatch_transform(Hook.ON_READ, page, {"shard_id": self.shard_id})

    def current_rev(self, key: str) -> str | None:
        return self._kernel.current_rev(key)

    def write(self, key: str, body: str) -> Page:
        if not self._profile.supports(Verb.WRITE):
            raise NotSupported(f"{type(self).__name__} ({self.shard_id}) is read-only")
        body = self._active.dispatch_transform(
            Hook.ON_WRITE, body, {"shard_id": self.shard_id, "key": key}
        )
        return self._kernel.write(key, body)


def build_engine_shard(
    shard_id: str,
    runtime: ExtensionRuntime,
    *,
    activate: Iterable[str] | None = None,
    provider: ActivationProvider | None = None,
    context: ActivationContext | None = None,
    base_profile: CapabilityProfile | None = None,
) -> EngineShardAdapter:
    """Stand up an engine shard: resolve which extensions are active (explicit ``activate`` ids,
    or via an activation ``provider`` over the runtime's available set), compose them, and wrap a
    fresh kernel as a `ShardAdapter`.
    """
    if provider is not None:
        ctx = context or ActivationContext(shard_id)
        ids = ActivationResolver(provider).active_extensions(runtime.available(), ctx)
    else:
        ids = set(activate or ())
    active = runtime.activate(ids)
    return EngineShardAdapter(EngineKernel(shard_id), active, base_profile)
165
src/shard_wiki/engine/extension.py
Normal file
@@ -0,0 +1,165 @@
"""Typed-extension runtime — the engine framework (WikiEngineCoreArchitecture §4, E-3/E-9).

Everything beyond the kernel's c2-minimum is an :class:`Extension`: it declares a typed
contract (id, provided capabilities, declared types, bound hooks, dependencies, conflicts) and
the runtime **composes** an activation set deterministically, **rejecting impossible profiles**
(unmet deps / conflicts / type collisions) — the §6.5 capability-as-data discipline applied to
extensions. Extension structure is **verified at registration** (mirrors §6.6 conformance):
bad ids or non-callable hook handlers are refused, so the framework acts on verified data.

Hooks are dispatched in a declared, deterministic order (dependency-topological, ties by id):
*transform* hooks chain a payload through handlers; *collect* hooks gather contributions.
"""

from __future__ import annotations

from collections.abc import Callable, Iterable, Mapping
from enum import Enum
from typing import Any, ClassVar

__all__ = ["Hook", "Extension", "ExtensionError", "ExtensionRuntime", "ActiveExtensions"]


class ExtensionError(ValueError):
    """Raised when an extension is malformed or an activation set is impossible (§6.5)."""


class Hook(Enum):
    # transform hooks: each handler takes (payload, ctx) and returns the next payload
    ON_WRITE = "on_write"  # transform a draft before persist
    ON_READ = "on_read"  # transform a page on read
    ON_RESOLVE = "on_resolve"  # transform a name resolution
    ON_RENDER = "on_render"  # produce a derived representation
    # collect hooks: each handler takes (payload, ctx) and returns a contribution
    ON_LINK = "on_link"  # contribute link/transclusion edges
    ON_QUERY = "on_query"  # answer a query
    ON_PROFILE = "on_profile"  # contribute capability-profile positions (E-5)


_TRANSFORM = frozenset({Hook.ON_WRITE, Hook.ON_READ, Hook.ON_RESOLVE, Hook.ON_RENDER})
_COLLECT = frozenset({Hook.ON_LINK, Hook.ON_QUERY, Hook.ON_PROFILE})


class Extension:
    """Base class for a typed extension. Subclasses set the class vars and override
    :meth:`hooks` to bind handlers (signature ``handler(payload, ctx) -> result``)."""

    id: ClassVar[str] = ""
    provides: ClassVar[tuple[str, ...]] = ()
    declares_types: ClassVar[tuple[str, ...]] = ()
    depends_on: ClassVar[tuple[str, ...]] = ()
    conflicts_with: ClassVar[tuple[str, ...]] = ()

    def hooks(self) -> Mapping[Hook, Callable[[Any, Any], Any]]:
        return {}


class ActiveExtensions:
    """A composed, ordered activation set with deterministic hook dispatch."""

    def __init__(self, ordered: list[Extension]) -> None:
        self._ordered = ordered
        self.ids: tuple[str, ...] = tuple(e.id for e in ordered)
        self._tables: dict[Hook, list[tuple[str, Callable[[Any, Any], Any]]]] = {}
        for ext in ordered:
            for hook, fn in ext.hooks().items():
                self._tables.setdefault(hook, []).append((ext.id, fn))

    def handlers(self, hook: Hook) -> tuple[str, ...]:
        """The extension ids bound to ``hook``, in dispatch order (for introspection)."""
        return tuple(eid for eid, _ in self._tables.get(hook, ()))

    def dispatch_transform(self, hook: Hook, payload: Any, ctx: Any = None) -> Any:
        if hook not in _TRANSFORM:
            raise ExtensionError(f"{hook} is not a transform hook")
        for _eid, fn in self._tables.get(hook, ()):
            payload = fn(payload, ctx)
        return payload

    def dispatch_collect(self, hook: Hook, payload: Any = None, ctx: Any = None) -> list[Any]:
        if hook not in _COLLECT:
            raise ExtensionError(f"{hook} is not a collect hook")
        return [fn(payload, ctx) for _eid, fn in self._tables.get(hook, ())]


class ExtensionRuntime:
    def __init__(self) -> None:
        self._registered: dict[str, Extension] = {}

    def available(self) -> frozenset[str]:
        """Ids of all registered extensions (the candidate set for activation)."""
        return frozenset(self._registered)

    def register(self, ext: Extension) -> Extension:
        """Register an extension after structural verification (mirrors §6.6)."""
        if not ext.id or not ext.id.startswith("ext."):
            raise ExtensionError(f"extension id must be 'ext.<name>', got {ext.id!r}")
        if ext.id in self._registered:
            raise ExtensionError(f"duplicate extension id: {ext.id}")
        bound = ext.hooks()
        for hook, fn in bound.items():
            if not isinstance(hook, Hook):
                raise ExtensionError(f"{ext.id}: hook key {hook!r} is not a Hook")
            if not callable(fn):
                raise ExtensionError(f"{ext.id}: handler for {hook} is not callable")
        self._registered[ext.id] = ext
        return ext

    def activate(self, ids: Iterable[str]) -> ActiveExtensions:
        """Compose an activation set: dependency closure → conflict/type checks → deterministic
        order. Raises :class:`ExtensionError` on an impossible profile."""
        requested = set(ids)
        unknown = requested - self._registered.keys()
        if unknown:
            raise ExtensionError(f"unknown extensions: {sorted(unknown)}")

        # dependency closure
        active: set[str] = set()
        frontier = list(requested)
        while frontier:
            eid = frontier.pop()
            if eid in active:
                continue
            ext = self._registered.get(eid)
            if ext is None:
                raise ExtensionError(f"unmet dependency: {eid}")
            active.add(eid)
            frontier.extend(d for d in ext.depends_on if d not in active)

        exts = [self._registered[e] for e in active]

        # conflicts
        for ext in exts:
            clash = active & set(ext.conflicts_with)
            if clash:
                raise ExtensionError(f"{ext.id} conflicts with active {sorted(clash)}")

        # type collisions (two active extensions claiming the same type id)
        owner: dict[str, str] = {}
        for ext in exts:
            for t in ext.declares_types:
                if t in owner:
                    raise ExtensionError(
                        f"type collision on {t!r}: {owner[t]} and {ext.id}"
                    )
                owner[t] = ext.id

        return ActiveExtensions(self._topo_order(exts))

    def _topo_order(self, exts: list[Extension]) -> list[Extension]:
        """Dependencies before dependents; ties broken by id (deterministic)."""
        by_id = {e.id: e for e in exts}
        ordered: list[Extension] = []
        placed: set[str] = set()

        def visit(ext: Extension) -> None:
            if ext.id in placed:
                return
            for dep in sorted(d for d in ext.depends_on if d in by_id):
                visit(by_id[dep])
            placed.add(ext.id)
            ordered.append(ext)

        for ext in sorted(exts, key=lambda e: e.id):
            visit(ext)
        return ordered
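The dispatch order described in the module docstring above (dependencies first, ties broken by id) can be illustrated in isolation. The following is a minimal standalone sketch, not the runtime itself: it works on a plain mapping of id to dependency ids instead of `Extension` objects, and every id other than `ext.struct` is hypothetical.

```python
from __future__ import annotations

from collections.abc import Mapping


def topo_order(deps: Mapping[str, tuple[str, ...]]) -> list[str]:
    """Order ids so dependencies precede dependents; ties are broken by id."""
    ordered: list[str] = []
    placed: set[str] = set()

    def visit(eid: str) -> None:
        if eid in placed:
            return
        # Visit dependencies in sorted order so the result is deterministic.
        for dep in sorted(d for d in deps[eid] if d in deps):
            visit(dep)
        placed.add(eid)
        ordered.append(eid)

    for eid in sorted(deps):
        visit(eid)
    return ordered


# Hypothetical activation set: only "ext.struct" is a real id from this diff.
deps = {
    "ext.addressing": ("ext.views",),
    "ext.authz": (),
    "ext.struct": (),
    "ext.views": ("ext.struct",),
}
order = topo_order(deps)
print(order)
# ['ext.struct', 'ext.views', 'ext.addressing', 'ext.authz']
```

Although `ext.addressing` sorts first by id, it is placed only after its dependency chain. Like the method it mirrors, this sketch assumes the dependency graph is acyclic; a cycle would recurse without terminating.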
10
src/shard_wiki/engine/extensions/__init__.py
Normal file
@@ -0,0 +1,10 @@
"""engine/extensions/ — built-in typed extensions for the wiki engine.

Each is a typed :class:`~shard_wiki.engine.extension.Extension` a shard activates only if needed.
``ext.struct`` (typed records) is the first; more (views, addressing, computational, authz) follow
the same pattern.
"""

from shard_wiki.engine.extensions.struct import StructExt, parse_frontmatter

__all__ = ["StructExt", "parse_frontmatter"]
81
src/shard_wiki/engine/extensions/struct.py
Normal file
@@ -0,0 +1,81 @@
"""ext.struct — typed records, a first built-in extension (WikiEngineCoreArchitecture X-STRUCT).

Demonstrates the typed-extension framework end-to-end. A page may carry a leading in-text
frontmatter block (`---` … `---`, `key: value` lines — git-diffable structure, blueprint T12).
With this extension **active**, the engine:

- **ON_WRITE** validates the structured block (optionally against an allowed-field set) — a
  malformed/disallowed structured page is rejected; the body is otherwise unchanged
  (content-preserving, so write conformance holds);
- **ON_READ** tags such pages as `PageShape.TYPED_RECORD`;
- **ON_PROFILE** raises the shard's profile with the `structured-payload` verb (E-5).

With the extension **inactive**, the kernel treats the same page as opaque prose — the feature
is genuinely absent (honest profile). This is "activate only what you need" in action.
"""

from __future__ import annotations

import dataclasses
from collections.abc import Iterable, Mapping
from typing import Any

from shard_wiki.engine.extension import Extension, Hook
from shard_wiki.engine.profile import ProfileContribution
from shard_wiki.model import Page, PageShape, Verb

__all__ = ["StructExt", "parse_frontmatter"]


def parse_frontmatter(body: str) -> tuple[dict[str, str], bool]:
    """Parse a leading ``---`` … ``---`` block of ``key: value`` lines.

    Returns ``(fields, has_block)``. An unterminated opening ``---`` is *not* a valid block.
    """
    lines = body.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, False
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields, True
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return {}, False  # no closing fence → not a frontmatter block


class StructExt(Extension):
    id = "ext.struct"
    declares_types = ("record",)
    provides = ("capability.wiki.page-model",)

    def __init__(self, allowed_fields: Iterable[str] | None = None) -> None:
        self._allowed: set[str] | None = set(allowed_fields) if allowed_fields is not None else None

    def hooks(self) -> Mapping[Hook, Any]:
        return {
            Hook.ON_WRITE: self._on_write,
            Hook.ON_READ: self._on_read,
            Hook.ON_PROFILE: self._on_profile,
        }

    def _on_write(self, body: str, ctx: Any) -> str:
        fields, has_block = parse_frontmatter(body)
        if has_block and self._allowed is not None:
            disallowed = set(fields) - self._allowed
            if disallowed:
                raise ValueError(f"ext.struct: disallowed fields {sorted(disallowed)}")
        return body  # structure stays in-text (git-diffable); body unchanged

    def _on_read(self, page: Page, ctx: Any) -> Page:
        _, has_block = parse_frontmatter(page.body)
        return dataclasses.replace(page, shape=PageShape.TYPED_RECORD) if has_block else page

    def _on_profile(self, payload: Any, ctx: Any) -> ProfileContribution:
        return ProfileContribution(verbs_add=frozenset({Verb.STRUCTURED_PAYLOAD}))

    @staticmethod
    def fields(body: str) -> dict[str, str]:
        """Parsed structured fields of a page body (empty if it has no frontmatter block)."""
        return parse_frontmatter(body)[0]
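To make the frontmatter rules above concrete, here is a small standalone sketch. It restates `parse_frontmatter` from the diff so it runs without the `shard_wiki` package, and shows the three cases the docstring distinguishes: a closed block, an unterminated block, and plain prose. The sample page bodies and the `allowed` set are made up for illustration.

```python
from __future__ import annotations


def parse_frontmatter(body: str) -> tuple[dict[str, str], bool]:
    """Standalone copy of the parser in the diff: returns (fields, has_block)."""
    lines = body.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, False
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields, True
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return {}, False  # opening fence never closed


# Illustrative page bodies (not taken from the repository).
record = "---\ntitle: Ohm's law\nunit: V\n---\nBody text.\n"
unterminated = "---\ntitle: dangling\nBody text.\n"
prose = "Just prose, no block.\n"

print(parse_frontmatter(record))        # ({'title': "Ohm's law", 'unit': 'V'}, True)
print(parse_frontmatter(unterminated))  # ({}, False)
print(parse_frontmatter(prose))         # ({}, False)

# The ON_WRITE check reduces to a set difference against the allowed fields.
allowed = {"title"}  # hypothetical allowed-field set
fields, has_block = parse_frontmatter(record)
disallowed = set(fields) - allowed if has_block else set()
print(sorted(disallowed))  # ['unit']
```

Note that an unterminated block yields `has_block == False`, so under the `_on_write` shown in the diff such a body passes through unchecked rather than being rejected.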
87
src/shard_wiki/engine/kernel.py
Normal file
@@ -0,0 +1,87 @@
"""Engine kernel — the small page-store core (WikiEngineCoreArchitecture §3, EC-1…EC-4).

The irreducible engine: author/read/edit pages (edit = a new version; delete = a recoverable
tombstone — history is the floor, I-10), enumerate keys, and resolve `[[wikilinks]]` (red-link =
an unresolved target). No feature beyond this c2-minimum lives in the kernel; everything else is
a typed extension (E-3).

Storage is intentionally simple here (in-memory version history); the git-IS-store backing
(SHARD-WP-0009/0012) slots in behind the same API later. The kernel reuses the page model and
provenance leaf; it does not redefine them.
"""

from __future__ import annotations

from collections.abc import Iterable
from datetime import datetime, timezone

from shard_wiki.engine.links import extract_wikilinks
from shard_wiki.model import Identity, Page, Placement
from shard_wiki.provenance import Liveness, ProvenanceEnvelope, Staleness

__all__ = ["EngineKernel"]


class EngineKernel:
    """An in-process page store with per-page version history for one engine shard."""

    def __init__(self, shard_id: str) -> None:
        self.shard_id = shard_id
        self._versions: dict[str, list[Page]] = {}
        self._deleted: set[str] = set()

    # --- write path (create/edit are one operation; both append a version) ---
    def write(self, key: str, body: str) -> Page:
        versions = self._versions.setdefault(key, [])
        rev = str(len(versions) + 1)
        page = Page(
            identity=Identity(self.shard_id, key),
            body=body,
            envelope=ProvenanceEnvelope(
                source_shard=self.shard_id,
                liveness=Liveness.STATIC,
                staleness=Staleness.FRESH,
                source_rev=rev,
                observed_at=datetime.now(tz=timezone.utc),
            ),
            placements=(Placement(self.shard_id, key),),
        )
        versions.append(page)
        self._deleted.discard(key)
        return page

    # --- read path ---
    def exists(self, key: str) -> bool:
        return key in self._versions and key not in self._deleted

    def read(self, key: str) -> Page:
        """Latest version of a live page. Raises ``KeyError`` if absent or deleted."""
        if not self.exists(key):
            raise KeyError(key)
        return self._versions[key][-1]

    def keys(self) -> Iterable[str]:
        return (k for k in sorted(self._versions) if k not in self._deleted)

    def current_rev(self, key: str) -> str | None:
        return self._versions[key][-1].envelope.source_rev if self.exists(key) else None

    # --- history & recoverability (I-10): versions are retained across delete ---
    def history(self, key: str) -> tuple[Page, ...]:
        """All versions ever written for ``key`` (oldest→newest), even after delete."""
        return tuple(self._versions.get(key, ()))

    def delete(self, key: str) -> None:
        """Tombstone a page (history retained; restore by writing again)."""
        if key not in self._versions:
            raise KeyError(key)
        self._deleted.add(key)

    # --- links (EC-4): resolution + red-link detection within this shard ---
    def links(self, key: str) -> list[str]:
        """Wikilink targets in a page's current body."""
        return extract_wikilinks(self.read(key).body)

    def resolve_link(self, target: str) -> Identity | None:
        """Resolve a wikilink target to a live page identity, or ``None`` (a **red-link**)."""
        return self.read(target).identity if self.exists(target) else None
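The version-history and tombstone behaviour described above (edit appends a version, delete hides the page but keeps its history, a later write restores it) can be sketched without the page model. `TinyStore` below is a hypothetical stand-in that stores plain strings instead of `Page` objects; it is an illustration of the pattern, not the kernel's implementation.

```python
from __future__ import annotations


class TinyStore:
    """Hypothetical string-only stand-in for the kernel's version/tombstone logic."""

    def __init__(self) -> None:
        self._versions: dict[str, list[str]] = {}
        self._deleted: set[str] = set()

    def write(self, key: str, body: str) -> str:
        """Append a version, clear any tombstone, and return the new revision id."""
        versions = self._versions.setdefault(key, [])
        versions.append(body)
        self._deleted.discard(key)
        return str(len(versions))

    def exists(self, key: str) -> bool:
        return key in self._versions and key not in self._deleted

    def delete(self, key: str) -> None:
        """Tombstone the key; the stored versions are left untouched."""
        if key not in self._versions:
            raise KeyError(key)
        self._deleted.add(key)

    def history(self, key: str) -> tuple[str, ...]:
        return tuple(self._versions.get(key, ()))


store = TinyStore()
rev1 = store.write("Home", "first draft")
rev2 = store.write("Home", "second draft")
store.delete("Home")
live_after_delete = store.exists("Home")   # False: tombstoned
kept = store.history("Home")               # both versions survive the delete
rev3 = store.write("Home", "restored")     # writing again lifts the tombstone
print(rev1, rev2, rev3, live_after_delete, store.exists("Home"), len(kept))
```

Because revision ids are derived from the length of the version list, a restore after delete continues the numbering rather than restarting it.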
25
src/shard_wiki/engine/links.py
Normal file
@@ -0,0 +1,25 @@
"""Wikilink extraction — the kernel's link primitive (WikiEngineCoreArchitecture EC-4).

`[[Target]]` and `[[Target|label]]`. CamelCase auto-linking is intentionally NOT here (it is an
opt-in concern per FederationRequirements ADR-06); the kernel only knows explicit wikilinks.
Link *resolution* (and red-link detection) is the kernel's job (it knows which keys exist);
*rendering* is a consumer concern (headless engine, no UI).
"""

from __future__ import annotations

import re

__all__ = ["extract_wikilinks"]

_WIKILINK = re.compile(r"\[\[([^\]|]+?)(?:\|[^\]]*)?\]\]")


def extract_wikilinks(body: str) -> list[str]:
    """Return the ordered, de-duplicated wikilink targets in ``body`` (label part dropped)."""
    seen: dict[str, None] = {}
    for m in _WIKILINK.finditer(body):
        target = m.group(1).strip()
        if target:
            seen.setdefault(target, None)
    return list(seen)
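A quick way to see what the pattern above accepts is to run it on a sample body. The sketch below restates the regex and function from the diff so it is runnable on its own; the sample text is invented for illustration.

```python
from __future__ import annotations

import re

# Same pattern as in the diff: capture the target, ignore an optional "|label" part.
_WIKILINK = re.compile(r"\[\[([^\]|]+?)(?:\|[^\]]*)?\]\]")


def extract_wikilinks(body: str) -> list[str]:
    """Ordered, de-duplicated targets; a dict is used as an insertion-ordered set."""
    seen: dict[str, None] = {}
    for m in _WIKILINK.finditer(body):
        target = m.group(1).strip()
        if target:
            seen.setdefault(target, None)
    return list(seen)


sample = "See [[Home]] and [[Ohm's Law|the law]], then [[Home]] again, [[ ]] and CamelCase."
targets = extract_wikilinks(sample)
print(targets)  # ['Home', "Ohm's Law"]
```

The repeated `[[Home]]` is reported once, the label after `|` is dropped, the whitespace-only `[[ ]]` is skipped, and the bare `CamelCase` word is not treated as a link.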
112
src/shard_wiki/engine/profile.py
Normal file
@@ -0,0 +1,112 @@
"""Capability profile derived from active extensions (WikiEngineCoreArchitecture E-5).

The engine's §A `CapabilityProfile` is **computed**, not hand-set: start from the kernel base
profile, then fold each active extension's `on_profile` contribution (in the runtime's
deterministic order), then `validate()`. This realizes the chain *configuration (which
extensions) → capability (the profile) → conformance* — activating an extension raises the
shard's advertised capabilities, and composition can never yield an impossible profile (validate
rejects it, §6.5).
"""

from __future__ import annotations

import dataclasses
from dataclasses import dataclass

from shard_wiki.engine.extension import ActiveExtensions, Hook
from shard_wiki.model import (
    AccessGrant,
    Addressing,
    AttachmentMode,
    CapabilityProfile,
    ContentOpacity,
    History,
    MergeModel,
    NativeQuery,
    OperationalEnvelope,
    Substrate,
    Translation,
    Verb,
    WriteGranularity,
)
from shard_wiki.provenance import Liveness

__all__ = ["engine_base_profile", "ProfileContribution", "derive_profile"]

# Profile fields an extension may *raise* via on_profile (substrate/attachment are kernel-fixed).
_OVERRIDABLE = (
    "write_granularity",
    "content_opacity",
    "liveness",
    "history",
    "merge_model",
    "addressing",
    "native_query",
    "translation",
    "access_grant",
)


def engine_base_profile() -> CapabilityProfile:
    """The kernel-only (no extensions) capability profile — the c2-minimum engine shard."""
    return CapabilityProfile(
        substrate=Substrate.FILES,
        attachment_mode=AttachmentMode.IN_ENGINE_HOST,
        write_granularity=WriteGranularity.PER_PAGE,
        content_opacity=ContentOpacity.TRANSPARENT,
        operational_envelope=OperationalEnvelope.LOCAL_UNBOUNDED,
        access_grant=AccessGrant.OPEN,
        liveness=Liveness.STATIC,
        history=History.INTERNAL_ONLY,
        merge_model=MergeModel.NONE,
        addressing=Addressing.PATH,
        native_query=NativeQuery.NONE,
        translation=Translation.NATIVE,
        supported_verbs=frozenset({Verb.READ, Verb.WRITE}),
    ).validate()


@dataclass(frozen=True, slots=True)
class ProfileContribution:
    """An extension's contribution to the derived profile (returned from its ON_PROFILE hook).

    A non-``None`` axis overrides that axis; ``verbs_add`` are unioned in. Order = the runtime's
    deterministic dispatch order, so later extensions win on a contested axis."""

    write_granularity: WriteGranularity | None = None
    content_opacity: ContentOpacity | None = None
    liveness: Liveness | None = None
    history: History | None = None
    merge_model: MergeModel | None = None
    addressing: Addressing | None = None
    native_query: NativeQuery | None = None
    translation: Translation | None = None
    access_grant: AccessGrant | None = None
    verbs_add: frozenset[Verb] = frozenset()


def derive_profile(
    active: ActiveExtensions, base: CapabilityProfile | None = None
) -> CapabilityProfile:
    """Fold active extensions' ON_PROFILE contributions onto ``base`` and validate the result.

    Raises :class:`~shard_wiki.model.ProfileError` if the composed profile is impossible — so an
    activation set can never advertise an invalid capability profile.
    """
    profile = base or engine_base_profile()
    contributions = active.dispatch_collect(Hook.ON_PROFILE)

    overrides: dict[str, object] = {}
    verbs: set[Verb] = set(profile.supported_verbs)
    for contrib in contributions:
        if not isinstance(contrib, ProfileContribution):
            continue
        for field_name in _OVERRIDABLE:
            value = getattr(contrib, field_name)
            if value is not None:
                overrides[field_name] = value
        verbs |= set(contrib.verbs_add)

    return dataclasses.replace(
        profile, supported_verbs=frozenset(verbs), **overrides
    ).validate()
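The fold in `derive_profile` has two rules: a non-`None` axis overrides (so the later contribution wins), and verbs are unioned. The sketch below shows just those two rules on toy types. `Profile`, `Contribution`, `fold`, and the string values are hypothetical simplifications; the real profile has more axes, uses enums, and calls `validate()` on the result, which is omitted here.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Profile:  # hypothetical two-axis stand-in for CapabilityProfile
    history: str
    native_query: str
    verbs: frozenset


@dataclass(frozen=True)
class Contribution:  # hypothetical stand-in for ProfileContribution
    history: Optional[str] = None
    native_query: Optional[str] = None
    verbs_add: frozenset = frozenset()


def fold(base: Profile, contributions: list) -> Profile:
    """Apply contributions in order: last non-None axis wins, verbs accumulate."""
    overrides: dict = {}
    verbs = set(base.verbs)
    for contrib in contributions:
        for name in ("history", "native_query"):
            value = getattr(contrib, name)
            if value is not None:
                overrides[name] = value
        verbs |= contrib.verbs_add
    return replace(base, verbs=frozenset(verbs), **overrides)


base = Profile(history="internal-only", native_query="none", verbs=frozenset({"read", "write"}))
derived = fold(
    base,
    [
        Contribution(native_query="keyword", verbs_add=frozenset({"structured-payload"})),
        Contribution(native_query="full-text"),  # later contribution wins this axis
    ],
)
print(derived.history, derived.native_query, sorted(derived.verbs))
```

An axis that no contribution touches (`history` here) keeps its base value, and the base profile itself is never mutated because `dataclasses.replace` returns a new instance.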
46
src/shard_wiki/incremental/__init__.py
Normal file
@@ -0,0 +1,46 @@
"""incremental/ — the incremental-first derived tier (CoreArchitectureBlueprint §8.7).

Equivalence is **indexed** (blocking/LSH + verify), not pairwise O(N²); maintenance is
**change-driven** (delta with retraction + propagation, review B-4), keeping the derived tier equal
to a from-scratch rebuild — which becomes a bounded fallback, not the operational path. A
Merkle-style **digest** plus a background **consistency-checker** make ``derived = f(canonical)``
verified rather than asserted (I-2), self-healing on detected drift.

In-memory only for this slice (no persisted index store); per-partition structure is honoured but
multi-tenant deployment is later. Per the dependency rule this imports down (model/provenance) and
is wired by the orchestrator.
"""

from shard_wiki.incremental.equivalence import (
    EquivalenceEdge,
    EquivalenceIndex,
    normalized_title,
)
from shard_wiki.incremental.minhash import (
    MinHasher,
    band_keys,
    jaccard,
    shingles,
)
from shard_wiki.incremental.union_index import UnionIndex
from shard_wiki.incremental.verification import (
    ConsistencyChecker,
    ConsistencyReport,
    derived_digest,
    region_digest,
)

__all__ = [
    "shingles",
    "MinHasher",
    "band_keys",
    "jaccard",
    "EquivalenceEdge",
    "EquivalenceIndex",
    "normalized_title",
    "derived_digest",
    "region_digest",
    "ConsistencyReport",
    "ConsistencyChecker",
    "UnionIndex",
]
225
src/shard_wiki/incremental/equivalence.py
Normal file
@@ -0,0 +1,225 @@
"""Indexed equivalence — blocking + verify, incrementally maintained (SHARD-WP-0011 T1/T2).

Equivalence (two *distinct* identities holding the same page) is detected without pairwise O(N²):

1. **Blocking** generates candidate pairs — pages sharing a normalized-title bucket or an LSH band
   (MinHash over content shingles).
2. **Verify** confirms a candidate — exact-body fingerprint match, or shingle Jaccard ≥ threshold —
   plus **curator bindings** (explicit decision-log edges) which are always equivalence edges.

The index is **incrementally maintained** (T2): ``add`` / ``update`` / ``remove`` re-bucket the
changed page, **retract** the edges it leaves and **add** the edges it enters; equivalence groups
are the connected components of the current edge set, so a retraction that disconnects a component
**splits** a chorus automatically. A full :meth:`build` is just repeated ``add`` — the bounded
rebuild fallback. The invariant (and the test oracle): incremental state == a from-scratch rebuild.
"""

from __future__ import annotations

import hashlib
import re
from collections.abc import Iterable
from dataclasses import dataclass

from shard_wiki.incremental.minhash import MinHasher, band_keys, jaccard, shingles
from shard_wiki.model import Identity, Page

__all__ = ["EquivalenceEdge", "EquivalenceIndex", "normalized_title"]

_NONALNUM_RE = re.compile(r"[^a-z0-9]+")


def normalized_title(key: str) -> str:
    """A blocking bucket key: the last path segment, lowercased, stripped of non-alphanumerics."""
    leaf = key.rsplit("/", 1)[-1]
    return _NONALNUM_RE.sub("", leaf.lower())


@dataclass(frozen=True, slots=True)
class EquivalenceEdge:
    """A verified equivalence between two identities, tagged with why it was accepted."""

    a: Identity
    b: Identity
    reason: str  # "fingerprint" | "content" | "curator"


@dataclass(frozen=True, slots=True)
class _Entry:
    shingle_set: frozenset[str]
    bands: tuple[tuple[int, tuple[int, ...]], ...]
    title: str
    fingerprint: str


def _fingerprint(body: str) -> str:
    return hashlib.blake2b(body.strip().encode("utf-8"), digest_size=16).hexdigest()


def _pair(a: Identity, b: Identity) -> frozenset[Identity]:
    return frozenset((a, b))


class EquivalenceIndex:
    """An incrementally maintained, blocked-and-verified equivalence relation over union pages."""

    def __init__(
        self,
        *,
        num_perm: int = 64,
        num_bands: int = 32,
        threshold: float = 0.7,
        hasher: MinHasher | None = None,
    ) -> None:
        self.threshold = threshold
        self.num_bands = num_bands
        self._hasher = hasher or MinHasher(num_perm=num_perm)
        self._entries: dict[Identity, _Entry] = {}
        self._band_buckets: dict[tuple[int, tuple[int, ...]], set[Identity]] = {}
        self._title_buckets: dict[str, set[Identity]] = {}
        self._content_edges: dict[frozenset[Identity], str] = {}
        self._curator_edges: set[frozenset[Identity]] = set()

    # -- build / maintain ----------------------------------------------------

    def build(
        self,
        pages: Iterable[Page],
        curator_edges: Iterable[tuple[Identity, Identity]] = (),
    ) -> None:
        """Rebuild from scratch (the bounded fallback): add every page, then curator edges."""
        self.__init__(
            num_bands=self.num_bands, threshold=self.threshold, hasher=self._hasher
        )
        for page in pages:
            self.add(page)
        for a, b in curator_edges:
            self.bind(a, b)

    def add(self, page: Page) -> None:
        """Index a new (or, via :meth:`update`, refreshed) page and add its equivalence edges."""
        identity = page.identity
        entry = self._make_entry(page)
        self._entries[identity] = entry
        for key in entry.bands:
            self._band_buckets.setdefault(key, set()).add(identity)
        self._title_buckets.setdefault(entry.title, set()).add(identity)

        for candidate in self._candidates(identity, entry):
            reason = self._verify(identity, candidate)
            if reason is not None:
                self._content_edges[_pair(identity, candidate)] = reason

    def remove(self, identity: Identity) -> None:
        """Drop a page: de-bucket it and retract every content edge incident to it."""
        entry = self._entries.pop(identity, None)
        if entry is None:
            return
        for key in entry.bands:
            self._discard_bucket(self._band_buckets, key, identity)
        self._discard_bucket(self._title_buckets, entry.title, identity)
        for edge in [e for e in self._content_edges if identity in e]:
            del self._content_edges[edge]

    def update(self, page: Page) -> None:
        """Apply a change as retract-then-add: stale (bucket-exit) edges go, new edges arrive."""
        self.remove(page.identity)
        self.add(page)

    def bind(self, a: Identity, b: Identity) -> None:
        """Record a curator equivalence (an explicit decision-log binding); always an edge."""
        if a != b:
            self._curator_edges.add(_pair(a, b))

    def unbind(self, a: Identity, b: Identity) -> None:
        self._curator_edges.discard(_pair(a, b))

    def set_curator_edges(self, edges: Iterable[tuple[Identity, Identity]]) -> None:
        """Replace all curator edges at once (re-syncing from the decision-log fold)."""
        self._curator_edges = {_pair(a, b) for a, b in edges if a != b}

    # -- queries -------------------------------------------------------------

    def identities(self) -> frozenset[Identity]:
        """All identities currently present in the index."""
        return frozenset(self._entries)

    def fingerprint(self, identity: Identity) -> str | None:
        """The content fingerprint indexed for ``identity`` (None if absent) — a digest leaf."""
        entry = self._entries.get(identity)
        return entry.fingerprint if entry is not None else None

    def edges(self) -> frozenset[frozenset[Identity]]:
        """All equivalence edges (content + curator) among currently present identities."""
        present = self._entries.keys()
        curator = {e for e in self._curator_edges if e <= present}
        return frozenset(set(self._content_edges) | curator)

    def groups(self) -> tuple[frozenset[Identity], ...]:
        """Equivalence groups: connected components of size ≥ 2 (union-find over the edges)."""
        parent: dict[Identity, Identity] = {}

        def find(x: Identity) -> Identity:
            parent.setdefault(x, x)
            root = x
            while parent[root] != root:
                root = parent[root]
            while parent[x] != root:
                parent[x], x = root, parent[x]
            return root

        for edge in self.edges():
            a, b = tuple(edge)
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb

        comps: dict[Identity, set[Identity]] = {}
        for node in parent:
            comps.setdefault(find(node), set()).add(node)
        return tuple(
            frozenset(members) for members in comps.values() if len(members) > 1
        )

    def equivalent_to(self, identity: Identity) -> frozenset[Identity]:
        """The equivalence group containing ``identity`` (including itself), else just itself."""
        for group in self.groups():
            if identity in group:
                return group
        return frozenset({identity})

    # -- internals -----------------------------------------------------------

    def _make_entry(self, page: Page) -> _Entry:
        shingle_set = shingles(page.body)
        signature = self._hasher.signature(shingle_set)
        return _Entry(
            shingle_set=shingle_set,
            bands=band_keys(signature, self.num_bands),
            title=normalized_title(page.identity.key),
            fingerprint=_fingerprint(page.body),
        )

    def _candidates(self, identity: Identity, entry: _Entry) -> set[Identity]:
        candidates: set[Identity] = set()
        for key in entry.bands:
            candidates |= self._band_buckets.get(key, set())
        candidates |= self._title_buckets.get(entry.title, set())
        candidates.discard(identity)
        return candidates

    def _verify(self, a: Identity, b: Identity) -> str | None:
        ea, eb = self._entries[a], self._entries[b]
        if ea.fingerprint == eb.fingerprint:
            return "fingerprint"
        if jaccard(ea.shingle_set, eb.shingle_set) >= self.threshold:
            return "content"
        return None

    @staticmethod
    def _discard_bucket(buckets: dict, key, identity: Identity) -> None:
        bucket = buckets.get(key)
        if bucket is not None:
            bucket.discard(identity)
            if not bucket:
                del buckets[key]
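The module docstring's claim that a retraction "splits a chorus automatically" follows from recomputing connected components over the current edge set. The sketch below demonstrates this with the same union-find shape as `groups()`, but on plain strings and a list of pairs; the function name `groups` and the node names are illustrative stand-ins, and blocking and verification are left out.

```python
from __future__ import annotations

from collections.abc import Iterable


def groups(edges: Iterable[tuple[str, str]]) -> set[frozenset[str]]:
    """Connected components of size >= 2 over undirected edges (union-find)."""
    parent: dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the walk directly at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    comps: dict[str, set[str]] = {}
    for node in parent:
        comps.setdefault(find(node), set()).add(node)
    return {frozenset(members) for members in comps.values() if len(members) > 1}


# Four hypothetical pages chained a-b-c-d form a single group.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
before = groups(edges)

# Retracting the middle edge (e.g. page "c" was edited and no longer matches "b")
# disconnects the component, so recomputing yields two groups.
after = groups([e for e in edges if e != ("b", "c")])
print(before)
print(after)
```

Since the groups are derived from the edge set on demand rather than stored, no separate "split" operation is needed; the cost is a full union-find pass per query.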
Some files were not shown because too many files have changed in this diff.