shard-wiki/spec/WikiEngineCoreArchitecture.md

# WikiEngineCoreArchitecture

Status: **draft for review** · Date: 2026-06-15 · Deliverable of **SHARD-WP-0013 T5**

The architecture of shard-wiki's **native reference wiki-engine**: a **headless, API-first**
engine — a **small core** plus a **stringent typed-extension framework** — that addresses the
whole use-case catalogue, mediates conflicting requirements into one integrated featureset, and
lets each shard **activate only what it needs**. This document is authoritative as of the ratified
INTENT amendment (2026-06-15, decision `84ffdb48`): the engine is **additive** and is shard-wiki's
**reference first-party shard backend (a canonical-mode shard)**. It is not a replacement for other
engines, and it is not a UI.

Relation to other specs (referenced, not restated):
- `CoreArchitectureBlueprint.md` — the orchestrator/whole-system architecture. **The engine is
  one shard behind §A; federation, union, projection, and cross-shard coordination are the
  orchestrator's job, not the engine's.** That is what keeps the engine small.
- `TechnicalSpecificationDocument.md §A` — the shard adapter contract the engine implements.
- `FederationRequirements.md` — page resolution, overlay, link semantics (ADRs the engine reuses).
- `UseCaseCatalog.md` "Capability structure" layer (T2) — the core-vs-extension map + the
  conflict-mediation map this document realizes.
- Reuse surface: the `capability.wiki.*` capabilities the engine consumes or realizes, plus the
  consumed `feature-control` / `authorization` capabilities (see §9).

---

## 1. Thesis: a small page-store kernel; everything else is a typed extension

> **The engine is a page-store kernel with a typed-extension runtime. Every capability beyond
> the c2-minimum is a *typed extension* a shard activates only if it needs it — and a shard's
> externally-visible capability profile is *computed from its active extension set*.**

That single chain — **configuration (which extensions) → capability (what the shard can do) →
conformance (verified)** — is the whole design. It mirrors the orchestrator's discipline
(`CoreArchitectureBlueprint` §6.5: capability-as-data, verified, no per-backend code) and turns
"integrated whole, yet activate only what you need" from a slogan into a mechanism.

The engine stays small for a structural reason: it is **one shard**, not a federation layer.
Union, projection, equivalence, cross-shard overlay-orchestration, and the federation models all
live in shard-wiki's orchestrator (the blueprint). The engine implements `ShardAdapter` (§A) and
nothing above it. So "wiki engine" here means *a really good single canonical shard with a
typed-extension framework and a headless agent-first API* — not a re-implementation of shard-wiki.

---

## 2. Engine invariants

| # | Invariant | Why |
|---|-----------|-----|
| E-1 | **One shard, not a federation layer.** The engine implements `ShardAdapter` (§A); union/projection/federation are the orchestrator's. | Keeps the engine small; no duplication of the blueprint. |
| E-2 | **Small kernel.** The kernel is only: page store + history, the page model (reused), the extension runtime, the API. | Common case (a plain wiki) is trivial. |
| E-3 | **Everything else is a typed extension.** No feature beyond the c2-minimum is baked into the kernel. | Integrated-whole-yet-selective; testable boundary. |
| E-4 | **Per-shard activation.** A shard runs an *activation profile* (a set of extensions + config); unused features cost nothing. | "Activate only what you need." |
| E-5 | **Capability profile is derived from active extensions.** The §A profile the engine declares is computed from its activation profile, then conformance-verified. | One source of truth; honest, verified capabilities. |
| E-6 | **Headless & API-first.** The API is the only interface; no bundled UI/rendering (consumer concern, L6). | INTENT amendment; clean orchestrator/consumer split. |
| E-7 | **Agent-first ergonomics.** The API is typed, introspectable, batchable, low-round-trip. | INTENT: optimized for efficient agent/automation access. |
| E-8 | **Reuse over reinvent.** Page model, history/journal, activation, and authz are *consumed* (existing capabilities), not rebuilt. | Smallness; reuse-surface alignment. |
| E-9 | **Extensions are typed & verified.** An extension declares its types/hooks/deps; activation is rejected if types conflict or deps are unmet (impossible profiles forbidden). | Stringency; mirrors §6.5 + conformance. |

---

## 3. The kernel (four concepts)

The kernel is deliberately four things — nothing more is mandatory.

1. **Page** — the backend-neutral page model (`capability.wiki.page-model`, reused as-is):
   stable identity ≠ placement, layered provenance, page shapes. The kernel does **not** redefine
   it; extensions may *register additional shapes/types* (§4).
2. **Store + history** — a git-backed page store (the engine is the *git-IS-store* case from the
   blueprint): a write is a commit; history is native and recoverable (E-3/I-10). Coordination
   decisions reuse the event-sourced journal (`capability.wiki.coordination-journal`).
3. **Extension runtime** — the typed-extension registry, hook dispatcher, type checker, and
   activation engine (§4). *This is the core innovation; it is the only “framework” in the kernel.*
4. **API** — the headless, typed, agent-first surface (§7). Kernel endpoints cover the c2-minimum
   (page CRUD-as-history, links, history); extensions extend the surface through typed routes.

The **c2-minimum** a kernel-only shard delivers (no extensions): write a page, link pages
(`[[wikilink]]` + red-link), never lose an edit. That is a complete, useful headless wiki.
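
As a rough illustration of the link half of the c2-minimum, the sketch below extracts `[[wikilink]]` targets and reports the ones that do not resolve (red links). The regular expression, the optional `|label` form, and the function names are assumptions made for this example; the actual link grammar is owned by `FederationRequirements.md`, not by this sketch.

```python
import re

# Hypothetical link pattern: [[Target]] or [[Target|label]].
# The real link semantics are defined elsewhere (FederationRequirements).
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_links(markdown: str) -> list:
    """Return link targets in document order, without duplicates."""
    seen = {}
    for match in WIKILINK.finditer(markdown):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


def red_links(markdown: str, existing_pages: set) -> list:
    """Return targets that do not resolve to an existing page."""
    return [t for t in extract_links(markdown) if t not in existing_pages]
```

A red link is only a report here: the write still succeeds, and the missing target can be created later.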

---

## 4. The typed-extension model (the framework)

An **Extension** is a typed unit declaring a contract the runtime enforces:

```
Extension:
  id            : reverse-domain id (e.g. ext.struct.typed-records)
  provides      : capability ids it realizes (reuse-surface; e.g. capability.wiki.page-model[typed])
  types         : page shapes / field schemas / content-types it introduces (typed, validated)
  hooks         : kernel lifecycle bindings it implements (see below)
  api           : typed routes it adds to the headless surface
  depends_on    : other extensions / consumed capabilities required
  conflicts_with: extensions it cannot co-activate with
  config        : declared, schema-checked activation parameters
```
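
The contract above could be mirrored in Python roughly as follows. This is a minimal sketch, not the extension SDK (which §11 defers): the field representations (for example `types` as a shape-to-schema mapping and `config_schema` as a name-to-Python-type mapping) and the `check_config` helper are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class Extension:
    """Illustrative mirror of the Extension contract; not the real SDK."""
    id: str
    provides: frozenset = frozenset()            # capability ids realized
    types: Mapping[str, Mapping[str, str]] = field(default_factory=dict)
    hooks: Mapping[str, Callable[..., Any]] = field(default_factory=dict)
    api: tuple = ()                              # typed routes added
    depends_on: frozenset = frozenset()
    conflicts_with: frozenset = frozenset()
    config_schema: Mapping[str, type] = field(default_factory=dict)

    def check_config(self, config: Mapping[str, Any]) -> None:
        """Schema-check activation parameters (assumed: name -> Python type)."""
        unknown = set(config) - set(self.config_schema)
        if unknown:
            raise ValueError(f"{self.id}: unknown config keys {sorted(unknown)}")
        for key, expected in self.config_schema.items():
            if key in config and not isinstance(config[key], expected):
                raise TypeError(f"{self.id}: {key} must be {expected.__name__}")
```

Making the declaration a frozen value keeps it inspectable data, in line with the capability-as-data discipline the document cites.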

**Hooks (the kernel lifecycle the runtime dispatches):**
`on_resolve` (name→page), `on_read`, `on_write` (validate/transform a draft), `on_link`
(link/transclusion resolution), `on_history`, `on_query`, `on_render_request` (produce a derived
representation for a consumer), `on_profile` (contribute capability-spectrum positions, E-5).
Hooks are **typed** (typed inputs/outputs) and dispatched in a **declared, deterministic order**.
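
One way to read "declared, deterministic order" is that each extension registers its hook with an explicit order value, and the dispatcher refuses ambiguous registrations. The sketch below shows this for `on_write` only; the `HookDispatcher` class, the integer `order` parameter, and the string-in/string-out hook signature are assumptions, since the real hook signatures are deferred (§11).

```python
from typing import Callable

# Assumed signature: an on_write hook takes a draft and returns a draft.
WriteHook = Callable[[str], str]


class HookDispatcher:
    """Runs on_write hooks in declared order, independent of registration order."""

    def __init__(self) -> None:
        self._on_write = []  # list of (order, extension id, hook)

    def register(self, ext_id: str, order: int, hook: WriteHook) -> None:
        # Two hooks with the same order would make the result ambiguous.
        if any(existing == order for existing, _, _ in self._on_write):
            raise ValueError(f"{ext_id}: on_write order {order} already declared")
        self._on_write.append((order, ext_id, hook))

    def dispatch_on_write(self, draft: str) -> str:
        for _, _, hook in sorted(self._on_write, key=lambda entry: entry[0]):
            draft = hook(draft)
        return draft
```

Rejecting a duplicate order at registration is one way to ensure conflicts are resolved by explicit precedence or rejection rather than by load order.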

**Typing & composition (stringency):**
- At activation, the runtime builds the **dependency closure**, checks **type consistency** (no
  two active extensions claim incompatible types for the same page shape/field; `conflicts_with`
  honoured), and rejects an **impossible profile** — exactly the §6.5 implication-rule discipline,
  applied to extensions. A rejected profile fails fast at boot, never silently.
- Composition is **deterministic**: hook order is declared; conflicts are resolved by explicit
  precedence or rejection, never by accident.
- Extensions ship a **conformance check** (mirrors §6.6): an activated extension is exercised
  against its declared types/hooks before the shard serves traffic — *typed contracts verified,
  not trusted*.
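
The activation-time checks described above can be sketched as a single function: build the dependency closure, honour `conflicts_with`, and reject incompatible type claims. Everything named here (`ExtSpec`, `ImpossibleProfile`, `resolve_profile`, the `(shape, field) -> type name` representation, and the choice to pull dependencies in from a catalog) is an assumption for illustration, not the engine's actual runtime.

```python
from dataclasses import dataclass, field


class ImpossibleProfile(Exception):
    """Raised at boot when an activation profile cannot be satisfied."""


@dataclass(frozen=True)
class ExtSpec:
    id: str
    depends_on: frozenset = frozenset()
    conflicts_with: frozenset = frozenset()
    # (page shape, field name) -> declared type name
    field_types: dict = field(default_factory=dict)


def resolve_profile(requested: set, catalog: dict) -> list:
    """Return the sorted dependency closure, or raise ImpossibleProfile."""
    closure = set()
    pending = sorted(requested)
    while pending:
        ext_id = pending.pop()
        if ext_id in closure:
            continue
        if ext_id not in catalog:
            raise ImpossibleProfile(f"unmet dependency: {ext_id}")
        closure.add(ext_id)
        pending.extend(sorted(catalog[ext_id].depends_on - closure))

    claimed = {}  # (shape, field) -> (owning extension, type name)
    for ext_id in sorted(closure):
        spec = catalog[ext_id]
        clash = spec.conflicts_with & closure
        if clash:
            raise ImpossibleProfile(f"{ext_id} conflicts with {sorted(clash)}")
        for key, type_name in spec.field_types.items():
            owner, prior = claimed.setdefault(key, (ext_id, type_name))
            if prior != type_name:
                raise ImpossibleProfile(
                    f"{key}: {owner} declares {prior}, {ext_id} declares {type_name}"
                )
    return sorted(closure)
```

Iterating in sorted order makes the failure message itself deterministic, so the same bad profile always fails the same way.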

**Per-shard activation (reuse, not reinvent):**
- A shard's **activation profile** = `{extension id → config}`. Activation/evaluation **reuses
  `capability.feature-control.evaluate`** (helix_forge/feature-control) — shard-wiki does not
  build a bespoke flagging system (T3 consumption).
- **E-5 in action:** the engine's `on_profile` hooks fold the active extensions into the §A
  **capability profile** the shard advertises to the orchestrator (e.g. activate
  `ext.struct.typed-records` → the `structure` spectrum rises and `structured-payload` is
  declared). The profile is then conformance-verified (§A.2). *Configuration → capability →
  conformance is one chain.*
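
To make the fold concrete, the sketch below derives a profile from the `on_profile` contributions of the active extensions. The `CapabilityProfile` shape (integer spectrum positions plus a set of declared flags) and the "take the maximum position" rule are assumptions; the real §A profile schema lives in `TechnicalSpecificationDocument.md` and may differ.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CapabilityProfile:
    spectra: dict = field(default_factory=dict)   # spectrum name -> position
    declared: set = field(default_factory=set)    # declared capability flags


# Assumed on_profile signature: no input, returns this extension's contribution.
ProfileHook = Callable[[], CapabilityProfile]


def derive_profile(baseline: CapabilityProfile, active_hooks: dict) -> CapabilityProfile:
    """Fold on_profile contributions into a copy of the kernel baseline."""
    result = CapabilityProfile(dict(baseline.spectra), set(baseline.declared))
    for ext_id in sorted(active_hooks):           # deterministic fold order
        contribution = active_hooks[ext_id]()
        for spectrum, position in contribution.spectra.items():
            # Assumption: an extension can only raise a spectrum position.
            result.spectra[spectrum] = max(result.spectra.get(spectrum, 0), position)
        result.declared |= contribution.declared
    return result
```

Because the result is a pure function of the baseline and the active set, the same activation profile always yields the same advertised profile, which is what conformance verification then checks.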

---

## 5. Featureset map: core vs extensions, and conflict mediation

The engine realizes the T2 "Capability structure" layer (`UseCaseCatalog.md`). Mapping (the
*page/content-level* clusters; **X-FED and X-ATT are orchestrator concerns, not engine
extensions** — E-1):

| Engine kernel (always on) | T2 | reuse-surface |
|---------------------------|----|---------------|
| Page lifecycle, identity/placement, history, links, store | EC-1…EC-5 | `capability.wiki.page-model`, `…coordination-journal`, `…adapter-contract` |

| Built-in typed extension | T2 cluster | provides / consumes | default |
|--------------------------|-----------|---------------------|---------|
| `ext.overlay` | X-OVERLAY | `capability.wiki.overlay` | on (no-op locally) |
| `ext.authz` (L0→L4 tiers) | X-AUTHZ | consumes `capability.authorization.policy-evaluate` | L0 |
| `ext.views` (BackLinks/RecentChanges/…) | X-VIEW | `capability.wiki.derived-views` | BackLinks/RecentChanges on |
| `ext.struct` (typed/computed/graph) | X-STRUCT | `capability.wiki.page-model[typed]` | off |
| `ext.addr` (span addr / transclusion / query) | X-ADDR | `capability.wiki.page-model`+query | transclusion on |
| `ext.compute` (literate/notebook/program/live) | X-COMP | `capability.wiki.engine-typed-extensions` | off (gated, sandbox) |
| `ext.prov` (rich provenance/metadata) | X-PROV | `capability.wiki.page-model[provenance]` | base on |
| `ext.collab` (c2 social patterns) | X-COLLAB | (UI/convention; mostly consumer) | off |

**Conflict mediation (T2 map) realized by the framework** — every tension is a *mechanism*, not a
baked-in choice, so one featureset serves all:

| Tension | Realized by |
|---------|-------------|
| open vs governed | `ext.authz` tiers (additive); kernel history is the floor at L0 |
| lossless vs lossy | a `translate` hook + fidelity report (consumes the proposed `capability.content.translation-fidelity`, G2) |
| live vs snapshot | `ext.compute`/`ext.addr` mark liveness; degrade to snapshot (never imply live) |
| canonical vs chorus | detection in kernel; resolution is a policy preset (orchestrator) |
| integrated-whole vs only-what-you-need | **the activation profile** (E-4) + typed composition (§4) — the headline mediation |
| minimal vs feature-rich | small kernel (§3) + extensions; nothing beyond c2 is mandatory |

---

## 6. The engine as a canonical-mode shard

The engine exposes itself through an `EngineShardAdapter` implementing §A:

- **Substrate** git-IS-store; **history** git-native; **write** = commit; `current_rev` = sha
  (apply-under-drift works out of the box). It is the **most capable shard** shard-wiki can
  attach — it dogfoods the contract.
- Its **capability profile is computed from active extensions** (E-5) and **conformance-verified**
  (§A.2) — so the orchestrator sees an honest profile, and federation ops degrade by the engine's
  *actually-activated* capabilities.
- The orchestrator attaches it like any shard; **federation/union/projection are not in the
  engine** (E-1). A standalone deployment is "the engine as the sole canonical shard"; a
  federated deployment is "the engine as one shard among many." Same engine, no re-architecture.
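
The revision-based write path can be illustrated with an in-memory stand-in for the git-backed store. This is a sketch under assumptions: `InMemoryPageStore`, `DriftDetected`, and the `base_rev` parameter are hypothetical names, the hash chain only imitates commit shas, and "reject the write when the page has moved on" is one possible reading of apply-under-drift rather than the §A contract itself.

```python
import hashlib
from typing import Optional


class DriftDetected(Exception):
    """The page changed after the revision the caller based its edit on."""


class InMemoryPageStore:
    """Stand-in for the git-backed store: every write appends, nothing is lost."""

    def __init__(self) -> None:
        self._history = {}  # page name -> list of (rev, text)

    def current_rev(self, name: str) -> Optional[str]:
        history = self._history.get(name)
        return history[-1][0] if history else None

    def write(self, name: str, text: str, base_rev: Optional[str]) -> str:
        current = self.current_rev(name)
        if base_rev != current:
            raise DriftDetected(f"{name}: based on {base_rev}, store is at {current}")
        # Chain the parent rev into the hash, loosely imitating a commit sha.
        rev = hashlib.sha1(f"{current or ''}\n{text}".encode("utf-8")).hexdigest()
        self._history.setdefault(name, []).append((rev, text))
        return rev

    def history(self, name: str) -> list:
        return [text for _, text in self._history.get(name, [])]
```

A stale writer gets an explicit error instead of silently overwriting, and earlier revisions stay readable through `history`.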

This is the precise realization of the INTENT reconciliation: shard-wiki orchestrates; the engine
is the first-party shard it can attach.

---

## 7. Headless API surface & agent ergonomics (E-6/E-7)

API-first means the typed API is the product; there is no UI. Agent-first means it is designed
for cheap, deterministic machine consumption:

- **Typed resource API** over pages, links, history, spans — content-negotiated (raw Markdown,
  the structured page model, or an extension-rendered representation via `on_render_request`).
- **Capability/extension introspection** — an endpoint returns the shard's **active extensions,
  their types, and the derived §A capability profile**, so an agent can discover *what this shard
  can do* before acting (no trial-and-error). This is the agent-facing twin of E-5.
- **Batch & query** — multi-page reads, link-graph and RecentChanges queries (via `ext.views`),
  and `on_query` delegation — minimizing round-trips.
- **Write via overlay** — edits go through the overlay path (FederationRequirements ADR-05), so
  agent writes are safe (draft → apply-under-drift) and attributable.
- **Deterministic & provenance-carrying** — every response carries the provenance envelope;
  identical inputs yield identical outputs (no hidden state) — friendly to caching agents.
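
A small sketch of the introspection and determinism points: the response is built from the active extensions and the derived profile, and serialized canonically so that identical inputs produce identical bytes. The payload layout, the function name, and the `history` spectrum used in the usage example are assumptions; the wire format is explicitly deferred (§11).

```python
import json


def introspection_document(active: dict, profile: dict) -> str:
    """Serialize active extensions and the derived profile deterministically.

    `active` maps an extension id to a dict with an optional "types" list;
    `profile` maps a spectrum name to a position. Both shapes are hypothetical.
    """
    document = {
        "extensions": [
            {"id": ext_id, "types": sorted(active[ext_id].get("types", []))}
            for ext_id in sorted(active)
        ],
        "capability_profile": profile,
    }
    # sort_keys plus fixed separators gives a canonical, cache-friendly encoding.
    return json.dumps(document, sort_keys=True, separators=(",", ":"))
```

For example, two shards configured with the same extensions in a different order return the same document:

```python
a = introspection_document(
    {"ext.views": {"types": ["RecentChanges", "BackLinks"]}, "ext.authz": {}},
    {"structure": 0, "history": 3},
)
b = introspection_document(
    {"ext.authz": {}, "ext.views": {"types": ["BackLinks", "RecentChanges"]}},
    {"history": 3, "structure": 0},
)
assert a == b
```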

---

## 8. Implementation sketch (module layout)

The engine lives under the shard-wiki package as a backend (it sits at L0/L1 — a shard behind the
adapter; nothing in the orchestrator depends *up* on it):

```
src/shard_wiki/engine/
  kernel.py        # page store + history (git-IS-store), lifecycle; reuses model/, provenance/, coordination/
  extension.py     # Extension contract, registry, typed hook dispatcher, type checker
  activation.py    # activation profile; reuses capability.feature-control.evaluate
  profile.py       # derive the §A CapabilityProfile from active extensions (E-5) + conformance
  api.py           # headless, typed, agent-first surface (+ extension introspection)
  adapter.py       # EngineShardAdapter implements adapters/ ShardAdapter (canonical-mode shard)
  extensions/      # built-ins: overlay/ authz/ views/ struct/ addr/ compute/ prov/ collab/
```

Dependency rule: `engine/` consumes `model/`, `provenance/`, `coordination/`, `adapters/`
(contract), `policy/`; it is consumed *only* via its `EngineShardAdapter` (the orchestrator
attaches it as a shard). No orchestrator-tier (`union/`, `projection/`) import.
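
The dependency rule lends itself to a mechanical check. The sketch below scans a module's source for absolute imports of orchestrator-tier packages using the standard `ast` module. The dotted names `shard_wiki.union` and `shard_wiki.projection` are inferred from the layout above and should be treated as assumptions; relative imports are deliberately not handled here.

```python
import ast

# Assumed dotted names for the orchestrator tier, inferred from the layout.
FORBIDDEN = ("shard_wiki.union", "shard_wiki.projection")


def forbidden_imports(source: str) -> list:
    """Return the forbidden packages that `source` imports (absolute imports only)."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Covers both `from shard_wiki.union import x`
            # and `from shard_wiki import union`.
            candidates = [node.module] + [
                f"{node.module}.{alias.name}" for alias in node.names
            ]
        else:
            continue
        for name in candidates:
            for prefix in FORBIDDEN:
                if name == prefix or name.startswith(prefix + "."):
                    hits.add(prefix)
    return sorted(hits)
```

Run over every file under `engine/` in a test, a non-empty result would flag a boundary violation before it reaches review.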

---

## 9. Reuse (what the engine consumes vs registers)

- **Consumes:** `capability.feature-control.evaluate` (activation),
  `capability.authorization.policy-evaluate` (`ext.authz`), the proposed
  `capability.content.translation-fidelity` (G2, lossy translation), and shard-wiki's own
  `capability.wiki.{page-model, coordination-journal, adapter-contract, overlay, derived-views}`.
- **Registers / realizes:** `capability.wiki.engine-typed-extensions` (this document is its
  Discovery evidence; D2→D3 on ratification). The cross-cutting **typed-extension framework**
  pattern is proposed back to the reuse surface as **G1**
  (`capability.platform.typed-extension-framework`); this engine is its first instance.

---

## 10. Traceability

- **INTENT** — realizes the 2026-06-15 amendment (decision `84ffdb48`): headless, API-first,
  additive native engine = canonical-mode shard backend; honours all engine invariants and the
  orchestrator boundary (E-1).
- **Use cases** — the kernel/extension split *is* the T2 "Capability structure" layer
  (`UseCaseCatalog.md`); every UC is either kernel (EC-1…EC-5) or a named extension; conflicts
  use the T2 mediation map (§5). The engine must ultimately cover UC-01–UC-84 (per-shard subsets).
- **Architecture** — consistent with `CoreArchitectureBlueprint` (engine = canonical-mode shard,
  §6 contract, §7 page model, §8.1 journal) and `TechnicalSpecificationDocument §A` (the contract
  it implements). `FederationRequirements` ADR-05/06 supply overlay + link semantics.
- **Reuse surface** — §9; G1/G2 proposals from SHARD-WP-0013 T3.

## 11. Decisions / deferred / open

**Decided:** small page-store kernel + typed-extension runtime (E-2/E-3); engine is one shard,
not a federation layer (E-1); capability profile derived from active extensions (E-5); headless,
API-first, agent-first (E-6/E-7); activation reuses `feature-control` (E-8); extensions are
typed + conformance-verified (E-9).

**Deferred:** the concrete extension SDK/ABI and hook signatures; the API protocol
(REST/GraphQL/MCP), where agent-first introspection is required but the wire format is left to an
implementation spike; the built-in extensions' internal designs (each is a later workplan).

**Open (tracked):** whether `ext.compute` ever executes in-process or strictly delegates/snapshots
(ties to blueprint §8.5 and trust/sandbox); whether the typed-extension framework is promoted to
the reuse-surface platform capability (G1) and then *consumed* here rather than engine-owned; how
to balance introspection granularity against leaking internal structure to agents.

## 12. Stability note

The **thesis (§1)** and **invariants (§2)** are load-bearing, especially *engine-is-one-shard*
(E-1), *small-kernel/everything-else-typed-extension* (E-2/E-3), and
*capability-profile-derived-from-extensions* (E-5). Changing them (e.g. moving federation into the
engine, or baking a feature into the kernel) is an architectural change in the sense of INTENT's
Stability Note and should be rare and deliberate. The headless/API-first posture is fixed by the
ratified INTENT amendment.