# WikiEngineCoreArchitecture

Status: **draft for review** · Date: 2026-06-15 · Deliverable of **SHARD-WP-0013 T5**

The architecture of shard-wiki's **native reference wiki-engine**: a **headless, API-first** engine — a **small core** plus a **stringent typed-extension framework** — that addresses the whole use-case catalogue, mediates conflicting requirements into one integrated featureset, and lets each shard **activate only what it needs**. Authoritative as of the ratified INTENT amendment (2026-06-15, decision `84ffdb48`): the engine is **additive** and is shard-wiki's **reference first-party shard backend (a canonical-mode shard)** — not a replacement for other engines, not a UI.

Relation to other specs (referenced, not restated):

- `CoreArchitectureBlueprint.md` — the orchestrator/whole-system architecture. **The engine is one shard behind §A; federation, union, projection, and cross-shard coordination are the orchestrator's job, not the engine's.** That is what keeps the engine small.
- `TechnicalSpecificationDocument.md §A` — the shard adapter contract the engine implements.
- `FederationRequirements.md` — page resolution, overlay, link semantics (ADRs the engine reuses).
- `UseCaseCatalog.md` "Capability structure" layer (T2) — the core-vs-extension map + the conflict-mediation map this document realizes.
- reuse surface (`capability.wiki.*`, plus consumed `feature-control` / `authorization`).

---

## 1. Thesis: a small page-store kernel; everything else is a typed extension

> **The engine is a page-store kernel with a typed-extension runtime. Every capability beyond
> the c2-minimum is a *typed extension* a shard activates only if it needs it — and a shard's
> externally-visible capability profile is *computed from its active extension set*.**

That single chain — **configuration (which extensions) → capability (what the shard can do) → conformance (verified)** — is the whole design.
It mirrors the orchestrator's discipline (`CoreArchitectureBlueprint` §6.5: capability-as-data, verified, no per-backend code) and turns "integrated whole, yet activate only what you need" from a slogan into a mechanism.

The engine stays small for a structural reason: it is **one shard**, not a federation layer. Union, projection, equivalence, cross-shard overlay-orchestration, and the federation models all live in shard-wiki's orchestrator (the blueprint). The engine implements `ShardAdapter` (§A) and nothing above it. So "wiki engine" here means *a really good single canonical shard with a typed-extension framework and a headless agent-first API* — not a re-implementation of shard-wiki.

---

## 2. Engine invariants

| # | Invariant | Why |
|---|-----------|-----|
| E-1 | **One shard, not a federation layer.** The engine implements `ShardAdapter` (§A); union/projection/federation are the orchestrator's. | Keeps the engine small; no duplication of the blueprint. |
| E-2 | **Small kernel.** The kernel is only: page store + history, the page model (reused), the extension runtime, the API. | Common case (a plain wiki) is trivial. |
| E-3 | **Everything else is a typed extension.** No feature beyond the c2-minimum is baked into the kernel. | Integrated-whole-yet-selective; testable boundary. |
| E-4 | **Per-shard activation.** A shard runs an *activation profile* (a set of extensions + config); unused features cost nothing. | "Activate only what you need." |
| E-5 | **Capability profile is derived from active extensions.** The §A profile the engine declares is computed from its activation profile, then conformance-verified. | One source of truth; honest, verified capabilities. |
| E-6 | **Headless & API-first.** The API is the only interface; no bundled UI/rendering (consumer concern, L6). | INTENT amendment; clean orchestrator/consumer split. |
| E-7 | **Agent-first ergonomics.** The API is typed, introspectable, batchable, low-round-trip. | INTENT: optimized for efficient agent/automation access. |
| E-8 | **Reuse over reinvent.** Page model, history/journal, activation, and authz are *consumed* (existing capabilities), not rebuilt. | Smallness; reuse-surface alignment. |
| E-9 | **Extensions are typed & verified.** An extension declares its types/hooks/deps; activation is rejected if types conflict or deps are unmet (impossible profiles forbidden). | Stringency; mirrors §6.5 + conformance. |

---

## 3. The kernel (four concepts)

The kernel is deliberately four things — nothing more is mandatory.

1. **Page** — the backend-neutral page model (`capability.wiki.page-model`, reused as-is): stable identity ≠ placement, layered provenance, page shapes. The kernel does **not** redefine it; extensions may *register additional shapes/types* (§4).
2. **Store + history** — a git-backed page store (the engine is the *git-IS-store* case from the blueprint): a write is a commit; history is native and recoverable (E-3/I-10). Coordination decisions reuse the event-sourced journal (`capability.wiki.coordination-journal`).
3. **Extension runtime** — the typed-extension registry, hook dispatcher, type checker, and activation engine (§4). *This is the core innovation; it is the only “framework” in the kernel.*
4. **API** — the headless, typed, agent-first surface (§7). Kernel endpoints cover the c2-minimum (page CRUD-as-history, links, history); extensions extend the surface through typed routes.

The **c2-minimum** a kernel-only shard delivers (no extensions): write a page, link pages (`[[wikilink]]` + red-link), never lose an edit. That is a complete, useful headless wiki.

---

## 4. The typed-extension model (the framework)

An **Extension** is a typed unit declaring a contract the runtime enforces:

```
Extension:
  id            : reverse-domain id (e.g. ext.struct.typed-records)
  provides      : capability ids it realizes (reuse-surface; e.g.
                  capability.wiki.page-model[typed])
  types         : page shapes / field schemas / content-types it introduces (typed, validated)
  hooks         : kernel lifecycle bindings it implements (see below)
  api           : typed routes it adds to the headless surface
  depends_on    : other extensions / consumed capabilities required
  conflicts_with: extensions it cannot co-activate with
  config        : declared, schema-checked activation parameters
```

**Hooks (the kernel lifecycle the runtime dispatches):** `on_resolve` (name→page), `on_read`, `on_write` (validate/transform a draft), `on_link` (link/transclusion resolution), `on_history`, `on_query`, `on_render_request` (produce a derived representation for a consumer), `on_profile` (contribute capability-spectrum positions, E-5). Hooks are **typed** (typed inputs/outputs) and dispatched in a **declared, deterministic order**.

**Typing & composition (stringency):**

- At activation, the runtime builds the **dependency closure**, checks **type consistency** (no two active extensions claim incompatible types for the same page shape/field; `conflicts_with` honoured), and rejects an **impossible profile** — exactly the §6.5 implication-rule discipline, applied to extensions. A rejected profile fails fast at boot, never silently.
- Composition is **deterministic**: hook order is declared; conflicts are resolved by explicit precedence or rejection, never by accident.
- Extensions ship a **conformance check** (mirrors §6.6): an activated extension is exercised against its declared types/hooks before the shard serves traffic — *typed contracts verified, not trusted*.

**Per-shard activation (reuse, not reinvent):**

- A shard's **activation profile** = `{extension id → config}`. Activation/evaluation **reuses `capability.feature-control.evaluate`** (helix_forge/feature-control) — shard-wiki does not build a bespoke flagging system (T3 consumption).
- **E-5 in action:** the engine's `on_profile` hooks fold the active extensions into the §A **capability profile** the shard advertises to the orchestrator (e.g. activate `ext.struct.typed-records` → the `structure` spectrum rises and `structured-payload` is declared). The profile is then conformance-verified (§A.2). *Configuration → capability → conformance is one chain.*

---

## 5. Featureset map: core vs extensions, and conflict mediation

The engine realizes the T2 "Capability structure" layer (`UseCaseCatalog.md`). Mapping (the *page/content-level* clusters; **X-FED and X-ATT are orchestrator concerns, not engine extensions** — E-1):

| Engine kernel (always on) | T2 | reuse-surface |
|---------------------------|----|---------------|
| Page lifecycle, identity/placement, history, links, store | EC-1…EC-5 | `capability.wiki.page-model`, `…coordination-journal`, `…adapter-contract` |

| Built-in typed extension | T2 cluster | provides / consumes | default |
|--------------------------|-----------|---------------------|---------|
| `ext.overlay` | X-OVERLAY | `capability.wiki.overlay` | on (no-op locally) |
| `ext.authz` (L0→L4 tiers) | X-AUTHZ | consumes `capability.authorization.policy-evaluate` | L0 |
| `ext.views` (BackLinks/RecentChanges/…) | X-VIEW | `capability.wiki.derived-views` | BackLinks/RecentChanges on |
| `ext.struct` (typed/computed/graph) | X-STRUCT | `capability.wiki.page-model[typed]` | off |
| `ext.addr` (span addr / transclusion / query) | X-ADDR | `capability.wiki.page-model`+query | transclusion on |
| `ext.compute` (literate/notebook/program/live) | X-COMP | `capability.wiki.engine-typed-extensions` | off (gated, sandbox) |
| `ext.prov` (rich provenance/metadata) | X-PROV | `capability.wiki.page-model[provenance]` | base on |
| `ext.collab` (c2 social patterns) | X-COLLAB | (UI/convention; mostly consumer) | off |

**Conflict mediation (T2 map) realized by the framework** — every tension is a *mechanism*, not a baked-in choice, so one
featureset serves all:

| Tension | Realized by |
|---------|-------------|
| open vs governed | `ext.authz` tiers (additive); kernel history is the floor at L0 |
| lossless vs lossy | a `translate` hook + fidelity report (consumes the proposed `capability.content.translation-fidelity`, G2) |
| live vs snapshot | `ext.compute`/`ext.addr` mark liveness; degrade to snapshot (never imply live) |
| canonical vs chorus | detection in kernel; resolution is a policy preset (orchestrator) |
| integrated-whole vs only-what-you-need | **the activation profile** (E-4) + typed composition (§4) — the headline mediation |
| minimal vs feature-rich | small kernel (§3) + extensions; nothing beyond c2 is mandatory |

---

## 6. The engine as a canonical-mode shard

The engine exposes itself through an `EngineShardAdapter` implementing §A:

- **Substrate** git-IS-store; **history** git-native; **write** = commit; `current_rev` = sha (apply-under-drift works out of the box). It is the **most capable shard** shard-wiki can attach — it dogfoods the contract.
- Its **capability profile is computed from active extensions** (E-5) and **conformance-verified** (§A.2) — so the orchestrator sees an honest profile, and federation ops degrade by the engine's *actually-activated* capabilities.
- The orchestrator attaches it like any shard; **federation/union/projection are not in the engine** (E-1). A standalone deployment is "the engine as the sole canonical shard"; a federated deployment is "the engine as one shard among many." Same engine, no re-architecture.

This is the precise realization of the INTENT reconciliation: shard-wiki orchestrates; the engine is the first-party shard it can attach.

---

## 7. Headless API surface & agent ergonomics (E-6/E-7)

API-first means the typed API is the product; there is no UI.
Agent-first means it is designed for cheap, deterministic machine consumption:

- **Typed resource API** over pages, links, history, spans — content-negotiated (raw Markdown, the structured page model, or an extension-rendered representation via `on_render_request`).
- **Capability/extension introspection** — an endpoint returns the shard's **active extensions, their types, and the derived §A capability profile**, so an agent can discover *what this shard can do* before acting (no trial-and-error). This is the agent-facing twin of E-5.
- **Batch & query** — multi-page reads, link-graph and RecentChanges queries (via `ext.views`), and `on_query` delegation — minimizing round-trips.
- **Write via overlay** — edits go through the overlay path (FederationRequirements ADR-05), so agent writes are safe (draft → apply-under-drift) and attributable.
- **Deterministic & provenance-carrying** — every response carries the provenance envelope; identical inputs yield identical outputs (no hidden state) — friendly to caching agents.

---

## 8. Implementation sketch (module layout)

The engine lives under the shard-wiki package as a backend (it sits at L0/L1 — a shard behind the adapter; nothing in the orchestrator depends *up* on it):

```
src/shard_wiki/engine/
  kernel.py        # page store + history (git-IS-store), lifecycle; reuses model/, provenance/, coordination/
  extension.py     # Extension contract, registry, typed hook dispatcher, type checker
  activation.py    # activation profile; reuses capability.feature-control.evaluate
  profile.py       # derive the §A CapabilityProfile from active extensions (E-5) + conformance
  api.py           # headless, typed, agent-first surface (+ extension introspection)
  adapter.py       # EngineShardAdapter implements adapters/ ShardAdapter (canonical-mode shard)
  extensions/      # built-ins: overlay/ authz/ views/ struct/ addr/ compute/ prov/ collab/
```

Dependency rule: `engine/` consumes `model/`, `provenance/`, `coordination/`, `adapters/` (contract), `policy/`; it is consumed *only* via its `EngineShardAdapter` (the orchestrator attaches it as a shard). No orchestrator-tier (`union/`, `projection/`) import.

---

## 9. Reuse (what the engine consumes vs registers)

- **Consumes:** `capability.feature-control.evaluate` (activation), `capability.authorization.policy-evaluate` (`ext.authz`), the proposed `capability.content.translation-fidelity` (G2, lossy translation), and shard-wiki's own `capability.wiki.{page-model, coordination-journal, adapter-contract, overlay, derived-views}`.
- **Registers / realizes:** `capability.wiki.engine-typed-extensions` (this document is its Discovery evidence — D2→D3 on ratification). The cross-cutting **typed-extension framework** pattern is proposed back to the reuse surface as **G1** (`capability.platform.typed-extension-framework`); this engine is its first instance.

---

## 10. Traceability

- **INTENT** — realizes the 2026-06-15 amendment (decision `84ffdb48`): headless, API-first, additive native engine = canonical-mode shard backend; honours all engine invariants and the orchestrator boundary (E-1).
- **Use cases** — the kernel/extension split *is* the T2 "Capability structure" layer (`UseCaseCatalog.md`); every UC is either kernel (EC-1…EC-5) or a named extension; conflicts use the T2 mediation map (§5). The engine must ultimately cover UC-01–UC-84 (per-shard subsets).
- **Architecture** — consistent with `CoreArchitectureBlueprint` (engine = canonical-mode shard, §6 contract, §7 page model, §8.1 journal) and `TechnicalSpecificationDocument §A` (the contract it implements). `FederationRequirements` ADR-05/06 supply overlay + link semantics.
- **Reuse surface** — §9; G1/G2 proposals from SHARD-WP-0013 T3.

## 11. Decisions / deferred / open

**Decided:** small page-store kernel + typed-extension runtime (E-2/E-3); engine is one shard, not a federation layer (E-1); capability profile derived from active extensions (E-5); headless, API-first, agent-first (E-6/E-7); activation reuses `feature-control` (E-8); extensions are typed + conformance-verified (E-9).

**Deferred:** the concrete extension SDK/ABI and hook signatures; the API protocol (REST/GraphQL/MCP) — agent-first introspection is required, the wire format is an implementation spike; the built-in extensions' internal designs (each is a later workplan).

**Open (tracked):** does `ext.compute` ever execute in-process or strictly delegate/snapshot (ties blueprint §8.5 + trust/sandbox); is the typed-extension framework promoted to the reuse-surface platform capability (G1) and then *consumed* here rather than engine-owned; introspection granularity vs. leaking internal structure to agents.

## 12. Stability note

The **thesis (§1)** and **invariants (§2)** — especially *engine-is-one-shard* (E-1), *small-kernel/everything-else-typed-extension* (E-2/E-3), and *capability-profile-derived-from-extensions* (E-5) — are load-bearing. Changing them (e.g. moving federation into the engine, or baking a feature into the kernel) is an architectural change in the sense of INTENT's Stability Note and should be rare and deliberate. The headless/API-first posture is fixed by the ratified INTENT amendment.
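
As a non-normative illustration of the load-bearing chain (configuration → capability → conformance), the activation checks of §4 and the E-5 derivation can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Extension` dataclass is a reduced form of the §4 contract, and `ImpossibleProfile`, `activate`, `derive_profile`, and the `ext.example.*` / `capability.example.*` ids are hypothetical names. The concrete SDK and hook signatures remain deferred (§11), and sorted ids merely stand in for the declared hook order.

```python
from dataclasses import dataclass, field


@dataclass
class Extension:
    """Hypothetical, reduced form of the §4 Extension contract."""
    id: str
    provides: frozenset = frozenset()          # capability ids it realizes
    types: dict = field(default_factory=dict)  # page shape/field -> declared type
    depends_on: frozenset = frozenset()
    conflicts_with: frozenset = frozenset()


class ImpossibleProfile(Exception):
    """Raised when an activation profile cannot be satisfied (fail fast at boot)."""


def activate(registry, requested):
    """Return the active extensions for the requested ids, or raise ImpossibleProfile."""
    # 1. Dependency closure: each requested extension plus everything it needs.
    closure, stack = set(), list(requested)
    while stack:
        ext_id = stack.pop()
        if ext_id in closure:
            continue
        if ext_id not in registry:
            raise ImpossibleProfile(f"unmet dependency: {ext_id}")
        closure.add(ext_id)
        stack.extend(registry[ext_id].depends_on)
    # Assumption: sorted ids stand in for the declared, deterministic order.
    active = [registry[i] for i in sorted(closure)]

    # 2. conflicts_with: a declaration on either side is enough to reject.
    for ext in active:
        clash = ext.conflicts_with & closure
        if clash:
            raise ImpossibleProfile(f"{ext.id} conflicts with {sorted(clash)}")

    # 3. Type consistency: no two extensions claim different types for one field.
    claimed = {}
    for ext in active:
        for name, typ in ext.types.items():
            owner, seen = claimed.setdefault(name, (ext.id, typ))
            if seen != typ:
                raise ImpossibleProfile(f"{name}: {owner}={seen} vs {ext.id}={typ}")
    return active


def derive_profile(active):
    """E-5: the advertised capabilities are a pure function of the active set."""
    return sorted({cap for ext in active for cap in ext.provides})


registry = {
    "ext.example.schema": Extension(
        "ext.example.schema",
        provides=frozenset({"capability.example.schema"}),
        types={"record.fields": "schema-v1"},
    ),
    "ext.example.records": Extension(
        "ext.example.records",
        provides=frozenset({"capability.example.records"}),
        depends_on=frozenset({"ext.example.schema"}),
    ),
}

active = activate(registry, ["ext.example.records"])
print([ext.id for ext in active])
print(derive_profile(active))
```

Requesting only `ext.example.records` pulls in `ext.example.schema` through the dependency closure, and the derived profile lists both capabilities. Because the profile is computed from the active set and never declared by hand, it cannot drift from the configuration, which is the property E-5 relies on. The conformance step (§A.2) is not modelled here.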