shard-wiki/spec/WikiEngineCoreArchitecture.md
tegwick 0ee972f2e2 spec(SHARD-WP-0013 T5): WikiEngineCoreArchitecture.md — small core + typed extensions
Headless, API-first, agent-optimized native engine = canonical-mode shard backend.
Thesis: a page-store kernel with a typed-extension runtime; everything beyond the
c2-minimum is a typed extension activated per shard, and the shard's §A capability
profile is DERIVED from its active extensions (configuration->capability->conformance).
9 engine invariants (engine-is-one-shard, small kernel, per-shard activation,
profile-from-extensions, headless/agent-first, reuse-not-reinvent, typed+verified).
Kernel (4 concepts), typed-extension model (typed hooks + deterministic composition +
feature-control activation), T2 featureset/conflict-mediation realized, engine-as-shard,
agent-first API surface, module sketch, reuse (consumes feature-control/authorization;
G1 framework proposal), traceability, decisions/open, stability note. Marks T5 done.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 22:54:40 +02:00


WikiEngineCoreArchitecture

Status: draft for review · Date: 2026-06-15 · Deliverable of SHARD-WP-0013 T5

The architecture of shard-wiki's native reference wiki-engine: a headless, API-first engine — a small core plus a stringent typed-extension framework — that addresses the whole use-case catalogue, mediates conflicting requirements into one integrated featureset, and lets each shard activate only what it needs. Authoritative as of the ratified INTENT amendment (2026-06-15, decision 84ffdb48): the engine is additive and is shard-wiki's reference first-party shard backend (a canonical-mode shard) — not a replacement for other engines, not a UI.

Relation to other specs (referenced, not restated):

  • CoreArchitectureBlueprint.md — the orchestrator/whole-system architecture. The engine is one shard behind §A; federation, union, projection, and cross-shard coordination are the orchestrator's job, not the engine's. That is what keeps the engine small.
  • TechnicalSpecificationDocument.md §A — the shard adapter contract the engine implements.
  • FederationRequirements.md — page resolution, overlay, link semantics (ADRs the engine reuses).
  • UseCaseCatalog.md "Capability structure" layer (T2) — the core-vs-extension map + the conflict-mediation map this document realizes.
  • reuse surface (capability.wiki.*, plus consumed feature-control / authorization).

1. Thesis: a small page-store kernel; everything else is a typed extension

The engine is a page-store kernel with a typed-extension runtime. Every capability beyond the c2-minimum is a typed extension a shard activates only if it needs it — and a shard's externally-visible capability profile is computed from its active extension set.

That single chain — configuration (which extensions) → capability (what the shard can do) → conformance (verified) — is the whole design. It mirrors the orchestrator's discipline (CoreArchitectureBlueprint §6.5: capability-as-data, verified, no per-backend code) and turns "integrated whole, yet activate only what you need" from a slogan into a mechanism.

The engine stays small for a structural reason: it is one shard, not a federation layer. Union, projection, equivalence, cross-shard overlay-orchestration, and the federation models all live in shard-wiki's orchestrator (the blueprint). The engine implements ShardAdapter (§A) and nothing above it. So "wiki engine" here means a really good single canonical shard with a typed-extension framework and a headless agent-first API — not a re-implementation of shard-wiki.


2. Engine invariants

| # | Invariant | Why |
|---|---|---|
| E-1 | One shard, not a federation layer. The engine implements ShardAdapter (§A); union/projection/federation are the orchestrator's. | Keeps the engine small; no duplication of the blueprint. |
| E-2 | Small kernel. The kernel is only: page store + history, the page model (reused), the extension runtime, the API. | Common case (a plain wiki) is trivial. |
| E-3 | Everything else is a typed extension. No feature beyond the c2-minimum is baked into the kernel. | Integrated-whole-yet-selective; testable boundary. |
| E-4 | Per-shard activation. A shard runs an activation profile (a set of extensions + config); unused features cost nothing. | "Activate only what you need." |
| E-5 | Capability profile is derived from active extensions. The §A profile the engine declares is computed from its activation profile, then conformance-verified. | One source of truth; honest, verified capabilities. |
| E-6 | Headless & API-first. The API is the only interface; no bundled UI/rendering (consumer concern, L6). | INTENT amendment; clean orchestrator/consumer split. |
| E-7 | Agent-first ergonomics. The API is typed, introspectable, batchable, low-round-trip. | INTENT: optimized for efficient agent/automation access. |
| E-8 | Reuse over reinvent. Page model, history/journal, activation, and authz are consumed (existing capabilities), not rebuilt. | Smallness; reuse-surface alignment. |
| E-9 | Extensions are typed & verified. An extension declares its types/hooks/deps; activation is rejected if types conflict or deps are unmet (impossible profiles forbidden). | Stringency; mirrors §6.5 + conformance. |

3. The kernel (four concepts)

The kernel is deliberately four things — nothing more is mandatory.

  1. Page — the backend-neutral page model (capability.wiki.page-model, reused as-is): stable identity ≠ placement, layered provenance, page shapes. The kernel does not redefine it; extensions may register additional shapes/types (§4).
  2. Store + history — a git-backed page store (the engine is the git-IS-store case from the blueprint): a write is a commit; history is native and recoverable (E-3/I-10). Coordination decisions reuse the event-sourced journal (capability.wiki.coordination-journal).
  3. Extension runtime — the typed-extension registry, hook dispatcher, type checker, and activation engine (§4). This is the core innovation; it is the only “framework” in the kernel.
  4. API — the headless, typed, agent-first surface (§7). Kernel endpoints cover the c2-minimum (page CRUD-as-history, links, history); extensions extend the surface through typed routes.

The c2-minimum a kernel-only shard delivers (no extensions): write a page, link pages ([[wikilink]] + red-link), never lose an edit. That is a complete, useful headless wiki.
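The c2-minimum can be illustrated with a toy, in-memory stand-in for the kernel. This is a minimal sketch, not the engine's store: the real kernel is git-backed, whereas `MiniKernel`, its method names, and the `[[Target|label]]` alias syntax are assumptions made only for this example. It shows the three guarantees: a write adds a revision, wikilinks are extracted, and a link to a missing page is reported as a red link.

```python
import re
from dataclasses import dataclass, field

# [[Target]] or [[Target|label]]; the alias form is an assumption of this sketch.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


@dataclass
class MiniKernel:
    """Append-only page store: every write adds a revision, nothing is overwritten."""
    revisions: dict = field(default_factory=dict)  # page name -> list of texts

    def write(self, name: str, text: str) -> int:
        history = self.revisions.setdefault(name, [])
        history.append(text)
        return len(history) - 1  # index of the revision just written

    def read(self, name: str, rev: int = -1) -> str:
        return self.revisions[name][rev]  # default: latest revision

    def links(self, name: str) -> list:
        """Return (target, exists) pairs; exists=False marks a red link."""
        targets = [t.strip() for t in WIKILINK.findall(self.read(name))]
        return [(t, t in self.revisions) for t in targets]


kernel = MiniKernel()
kernel.write("Home", "See [[Roadmap]] and [[Glossary|terms]].")
kernel.write("Roadmap", "v1")
kernel.write("Roadmap", "v2")
print(kernel.links("Home"))       # [('Roadmap', True), ('Glossary', False)]
print(kernel.read("Roadmap", 0))  # v1 (the first edit is still recoverable)
```
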


4. The typed-extension model (the framework)

An Extension is a typed unit declaring a contract the runtime enforces:

Extension:
  id            : reverse-domain id (e.g. ext.struct.typed-records)
  provides      : capability ids it realizes (reuse-surface; e.g. capability.wiki.page-model[typed])
  types         : page shapes / field schemas / content-types it introduces (typed, validated)
  hooks         : kernel lifecycle bindings it implements (see below)
  api           : typed routes it adds to the headless surface
  depends_on    : other extensions / consumed capabilities required
  conflicts_with: extensions it cannot co-activate with
  config        : declared, schema-checked activation parameters

Hooks (the kernel lifecycle the runtime dispatches): on_resolve (name→page), on_read, on_write (validate/transform a draft), on_link (link/transclusion resolution), on_history, on_query, on_render_request (produce a derived representation for a consumer), on_profile (contribute capability-spectrum positions, E-5). Hooks are typed (typed inputs/outputs) and dispatched in a declared, deterministic order.
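One way to picture typed hooks with a declared, deterministic order is sketched below. The concrete hook signatures are deferred by this spec (§11), so everything here is hypothetical: the `Hook` type alias, the `order` field, and the `(order, id)` tie-break are assumptions chosen to make the idea concrete, not the engine's ABI.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical hook signature: a draft dict in, a (possibly transformed) draft dict out.
Hook = Callable[[dict], dict]


@dataclass
class Extension:
    id: str
    order: int = 0                             # assumed explicit precedence
    hooks: dict = field(default_factory=dict)  # hook name -> Hook


def dispatch(hook_name: str, extensions: list, payload: dict) -> dict:
    """Run one hook across extensions in (order, id) order, threading the payload."""
    for ext in sorted(extensions, key=lambda e: (e.order, e.id)):
        handler = ext.hooks.get(hook_name)
        if handler is not None:
            payload = handler(payload)
    return payload


def trim(draft: dict) -> dict:
    return {**draft, "text": draft["text"].strip()}


def stamp(draft: dict) -> dict:
    return {**draft, "provenance": "ext.prov"}


# Registration order is deliberately "wrong"; the declared order decides.
exts = [
    Extension("ext.prov", order=20, hooks={"on_write": stamp}),
    Extension("ext.struct", order=10, hooks={"on_write": trim}),
]
result = dispatch("on_write", exts, {"text": "  hello  "})
print(result)  # {'text': 'hello', 'provenance': 'ext.prov'}
```

Sorting on `(order, id)` rather than on registration order means the same active set always yields the same hook sequence, regardless of how the extensions were loaded.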

Typing & composition (stringency):

  • At activation, the runtime builds the dependency closure, checks type consistency (no two active extensions claim incompatible types for the same page shape/field; conflicts_with honoured), and rejects an impossible profile — exactly the §6.5 implication-rule discipline, applied to extensions. A rejected profile fails fast at boot, never silently.
  • Composition is deterministic: hook order is declared; conflicts are resolved by explicit precedence or rejection, never by accident.
  • Extensions ship a conformance check (mirrors §6.6): an activated extension is exercised against its declared types/hooks before the shard serves traffic — typed contracts verified, not trusted.
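The activation-time checks above (dependency closure, `conflicts_with`, type consistency, fail-fast rejection) could look roughly like the following. It is a sketch under stated assumptions: `ExtDecl`, `resolve`, `ImpossibleProfile`, and the `ext.legacy` entry are hypothetical, and "type consistency" is reduced here to "no two active extensions declare different types for the same field name".

```python
from dataclasses import dataclass, field


class ImpossibleProfile(Exception):
    """Raised when an activation profile cannot be satisfied."""


@dataclass
class ExtDecl:
    id: str
    depends_on: set = field(default_factory=set)
    conflicts_with: set = field(default_factory=set)
    types: dict = field(default_factory=dict)  # field name -> declared type


def resolve(requested, catalog):
    """Return the sorted active set, or raise ImpossibleProfile."""
    active, todo = set(), sorted(requested)
    while todo:  # 1. build the dependency closure
        ext_id = todo.pop()
        if ext_id in active:
            continue
        if ext_id not in catalog:
            raise ImpossibleProfile(f"unmet dependency: {ext_id}")
        active.add(ext_id)
        todo.extend(sorted(catalog[ext_id].depends_on))
    claimed = {}
    for ext_id in sorted(active):
        decl = catalog[ext_id]
        clash = decl.conflicts_with & active  # 2. conflicts_with honoured
        if clash:
            raise ImpossibleProfile(f"{ext_id} conflicts with {sorted(clash)}")
        for name, typ in decl.types.items():  # 3. one type per field
            owner_type, owner = claimed.setdefault(name, (typ, ext_id))
            if owner_type != typ:
                raise ImpossibleProfile(
                    f"{name}: {owner} says {owner_type}, {ext_id} says {typ}")
    return sorted(active)


catalog = {
    "ext.prov": ExtDecl("ext.prov", types={"author": "str"}),
    "ext.struct": ExtDecl("ext.struct", depends_on={"ext.prov"},
                          types={"due": "date"}),
    # ext.legacy is invented for the example only.
    "ext.legacy": ExtDecl("ext.legacy", conflicts_with={"ext.struct"},
                          types={"due": "str"}),
}
print(resolve({"ext.struct"}, catalog))  # ['ext.prov', 'ext.struct']
```

Requesting `{"ext.struct", "ext.legacy"}` raises `ImpossibleProfile` before anything is served, which is the fail-fast behaviour the list describes.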

Per-shard activation (reuse, not reinvent):

  • A shard's activation profile = {extension id → config}. Activation/evaluation reuses capability.feature-control.evaluate (helix_forge/feature-control) — shard-wiki does not build a bespoke flagging system (T3 consumption).
  • E-5 in action: the engine's on_profile hooks fold the active extensions into the §A capability profile the shard advertises to the orchestrator (e.g. activate ext.struct.typed-records → the structure spectrum rises and structured-payload is declared). The profile is then conformance-verified (§A.2). Configuration → capability → conformance is one chain.
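The configuration → capability → conformance chain can be sketched as a fold followed by a check. The spectrum names, the integer positions, the "take the maximum" folding rule, and the probe results are all assumptions made for illustration; the spec only states that `on_profile` hooks contribute spectrum positions and that the result is conformance-verified.

```python
# Hypothetical baseline a kernel-only shard would declare.
KERNEL_BASE = {"history": 1, "links": 1, "structure": 0}


def derive_profile(contributions):
    """Fold each active extension's on_profile contribution into one profile.

    Assumed rule: a spectrum's position is the highest any contributor claims.
    """
    profile = dict(KERNEL_BASE)
    for contrib in contributions:
        for spectrum, position in contrib.items():
            profile[spectrum] = max(profile.get(spectrum, 0), position)
    return profile


def unverified(profile, observed):
    """Return spectra whose declared position exceeds what probes observed."""
    return sorted(s for s, pos in profile.items() if observed.get(s, 0) < pos)


# Activating a typed-records extension raises the structure spectrum.
profile = derive_profile([{"structure": 2}])
print(profile)  # {'history': 1, 'links': 1, 'structure': 2}

# A conformance run that only observed structure=1 flags the over-claim.
print(unverified(profile, {"history": 1, "links": 1, "structure": 1}))  # ['structure']
```
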

5. Featureset map: core vs extensions, and conflict mediation

The engine realizes the T2 "Capability structure" layer (UseCaseCatalog.md). Mapping (the page/content-level clusters; X-FED and X-ATT are orchestrator concerns, not engine extensions — E-1):

| Engine kernel (always on) | T2 | reuse-surface |
|---|---|---|
| Page lifecycle, identity/placement, history, links, store | EC-1…EC-5 | capability.wiki.page-model, …coordination-journal, …adapter-contract |

| Built-in typed extension | T2 cluster | provides / consumes | default |
|---|---|---|---|
| ext.overlay | X-OVERLAY | capability.wiki.overlay | on (no-op locally) |
| ext.authz (L0→L4 tiers) | X-AUTHZ | consumes capability.authorization.policy-evaluate | L0 |
| ext.views (BackLinks/RecentChanges/…) | X-VIEW | capability.wiki.derived-views | BackLinks/RecentChanges on |
| ext.struct (typed/computed/graph) | X-STRUCT | capability.wiki.page-model[typed] | off |
| ext.addr (span addr / transclusion / query) | X-ADDR | capability.wiki.page-model+query | transclusion on |
| ext.compute (literate/notebook/program/live) | X-COMP | capability.wiki.engine-typed-extensions | off (gated, sandbox) |
| ext.prov (rich provenance/metadata) | X-PROV | capability.wiki.page-model[provenance] | base on |
| ext.collab (c2 social patterns) | X-COLLAB | (UI/convention; mostly consumer) | off |

Conflict mediation (T2 map) realized by the framework — every tension is a mechanism, not a baked-in choice, so one featureset serves all:

| Tension | Realized by |
|---|---|
| open vs governed | ext.authz tiers (additive); kernel history is the floor at L0 |
| lossless vs lossy | a translate hook + fidelity report (consumes the proposed capability.content.translation-fidelity, G2) |
| live vs snapshot | ext.compute/ext.addr mark liveness; degrade to snapshot (never imply live) |
| canonical vs chorus | detection in kernel; resolution is a policy preset (orchestrator) |
| integrated-whole vs only-what-you-need | the activation profile (E-4) + typed composition (§4); the headline mediation |
| minimal vs feature-rich | small kernel (§3) + extensions; nothing beyond c2 is mandatory |

6. The engine as a canonical-mode shard

The engine exposes itself through an EngineShardAdapter implementing §A:

  • Substrate git-IS-store; history git-native; write = commit; current_rev = sha (apply-under-drift works out of the box). It is the most capable shard shard-wiki can attach — it dogfoods the contract.
  • Its capability profile is computed from active extensions (E-5) and conformance-verified (§A.2) — so the orchestrator sees an honest profile, and federation ops degrade by the engine's actually-activated capabilities.
  • The orchestrator attaches it like any shard; federation/union/projection are not in the engine (E-1). A standalone deployment is "the engine as the sole canonical shard"; a federated deployment is "the engine as one shard among many." Same engine, no re-architecture.

This is the precise realization of the INTENT reconciliation: shard-wiki orchestrates; the engine is the first-party shard it can attach.
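The "write = commit; current_rev = sha" behaviour can be illustrated without git. The sketch below is not the §A contract: `EngineShardSketch`, `apply`, and `DriftError` are hypothetical names, the hash chain only imitates git's content addressing, and it assumes that apply-under-drift means a draft built on a stale revision is detected rather than silently applied.

```python
import hashlib
from typing import Optional


class DriftError(Exception):
    """The draft was based on a revision that is no longer current."""


class EngineShardSketch:
    def __init__(self) -> None:
        self._log = []  # append-only list of (sha, page, text)

    @property
    def current_rev(self) -> Optional[str]:
        return self._log[-1][0] if self._log else None

    def apply(self, page: str, text: str, base_rev: Optional[str]) -> str:
        """Commit a draft only if it was drafted against the current revision."""
        if base_rev != self.current_rev:
            raise DriftError(
                f"draft based on {base_rev}, shard is at {self.current_rev}")
        parent = self.current_rev or ""
        # Chain the parent sha into the new one, as a commit graph would.
        sha = hashlib.sha1(f"{parent}\n{page}\n{text}".encode("utf-8")).hexdigest()
        self._log.append((sha, page, text))
        return sha


shard = EngineShardSketch()
rev1 = shard.apply("Home", "hello", base_rev=None)
rev2 = shard.apply("Home", "hello again", base_rev=rev1)
try:
    shard.apply("Home", "stale edit", base_rev=rev1)  # rev1 is no longer current
except DriftError as err:
    print("rejected:", err)
```

Tracking one shard-wide revision keeps the sketch short; whether drift is judged per shard or per page is not stated in this document.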


7. Headless API surface & agent ergonomics (E-6/E-7)

API-first means the typed API is the product; there is no UI. Agent-first means it is designed for cheap, deterministic machine consumption:

  • Typed resource API over pages, links, history, spans — content-negotiated (raw Markdown, the structured page model, or an extension-rendered representation via on_render_request).
  • Capability/extension introspection — an endpoint returns the shard's active extensions, their types, and the derived §A capability profile, so an agent can discover what this shard can do before acting (no trial-and-error). This is the agent-facing twin of E-5.
  • Batch & query — multi-page reads, link-graph and RecentChanges queries (via ext.views), and on_query delegation — minimizing round-trips.
  • Write via overlay — edits go through the overlay path (FederationRequirements ADR-05), so agent writes are safe (draft → apply-under-drift) and attributable.
  • Deterministic & provenance-carrying — every response carries the provenance envelope; identical inputs yield identical outputs (no hidden state) — friendly to caching agents.
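A possible shape for the introspection response, combined with the determinism requirement, is sketched below. The wire format is explicitly deferred (§11), so the field names (`extensions`, `capability_profile`, `provenance`) and the JSON encoding are assumptions. The point shown is narrower: canonical ordering makes identical inputs produce byte-identical output, which is what lets an agent cache the answer.

```python
import json


def introspect(active_extensions: dict, profile: dict) -> str:
    """Serialize active extensions and the derived profile in canonical form."""
    body = {
        "extensions": [
            {"id": ext_id, "types": sorted(meta.get("types", []))}
            for ext_id, meta in sorted(active_extensions.items())
        ],
        "capability_profile": profile,
        # Hypothetical provenance envelope: which inputs the answer came from.
        "provenance": {"source": "engine",
                       "derived_from": sorted(active_extensions)},
    }
    # sort_keys + fixed separators: same input, same bytes.
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


a = introspect({"ext.views": {"types": ["RecentChanges", "BackLinks"]},
                "ext.overlay": {}},
               {"structure": 0, "history": 1})
b = introspect({"ext.overlay": {},
                "ext.views": {"types": ["BackLinks", "RecentChanges"]}},
               {"history": 1, "structure": 0})
print(a == b)  # True: input ordering does not leak into the response
```
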

8. Implementation sketch (module layout)

The engine lives under the shard-wiki package as a backend (it sits at L0/L1 — a shard behind the adapter; nothing in the orchestrator depends up on it):

src/shard_wiki/engine/
  kernel.py        # page store + history (git-IS-store), lifecycle; reuses model/, provenance/, coordination/
  extension.py     # Extension contract, registry, typed hook dispatcher, type checker
  activation.py    # activation profile; reuses capability.feature-control.evaluate
  profile.py       # derive the §A CapabilityProfile from active extensions (E-5) + conformance
  api.py           # headless, typed, agent-first surface (+ extension introspection)
  adapter.py       # EngineShardAdapter implements adapters/ ShardAdapter (canonical-mode shard)
  extensions/      # built-ins: overlay/ authz/ views/ struct/ addr/ compute/ prov/ collab/

Dependency rule: engine/ consumes model/, provenance/, coordination/, adapters/ (contract), policy/; it is consumed only via its EngineShardAdapter (the orchestrator attaches it as a shard). No orchestrator-tier (union/, projection/) import.
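The "no orchestrator-tier import" half of this rule is mechanically checkable. The following sketch uses Python's standard `ast` module to flag absolute imports of `shard_wiki.union` or `shard_wiki.projection` in a source string; the `forbidden_imports` helper is hypothetical, and relative imports (`from ..union import x`) are deliberately left out of scope to keep it short.

```python
import ast

# Orchestrator-tier packages that engine/ must not import (from the rule above).
FORBIDDEN = ("shard_wiki.union", "shard_wiki.projection")


def forbidden_imports(source: str) -> list:
    """Return the sorted dotted names of forbidden absolute imports in source."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Full dotted name of each imported object, e.g. shard_wiki.union.
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue  # not an import, or a relative import (not handled here)
        for name in names:
            if any(name == p or name.startswith(p + ".") for p in FORBIDDEN):
                hits.add(name)
    return sorted(hits)


sample = (
    "import shard_wiki.union.merge\n"
    "from shard_wiki.model import Page\n"
    "from shard_wiki.projection import view\n"
)
print(forbidden_imports(sample))
# ['shard_wiki.projection.view', 'shard_wiki.union.merge']
```

Run over every file under `src/shard_wiki/engine/`, a check like this could turn the dependency rule into a failing test rather than a review convention.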


9. Reuse (what the engine consumes vs registers)

  • Consumes: capability.feature-control.evaluate (activation), capability.authorization.policy-evaluate (ext.authz), the proposed capability.content.translation-fidelity (G2, lossy translation), and shard-wiki's own capability.wiki.{page-model, coordination-journal, adapter-contract, overlay, derived-views}.
  • Registers / realizes: capability.wiki.engine-typed-extensions (this document is its Discovery evidence; D2→D3 on ratification). The cross-cutting typed-extension framework pattern is proposed back to the reuse surface as G1 (capability.platform.typed-extension-framework); this engine is its first instance.

10. Traceability

  • INTENT — realizes the 2026-06-15 amendment (decision 84ffdb48): headless, API-first, additive native engine = canonical-mode shard backend; honours all engine invariants and the orchestrator boundary (E-1).
  • Use cases — the kernel/extension split is the T2 "Capability structure" layer (UseCaseCatalog.md); every UC is either kernel (EC-1…EC-5) or a named extension; conflicts use the T2 mediation map (§5). The engine must ultimately cover UC-01…UC-84 (per-shard subsets).
  • Architecture — consistent with CoreArchitectureBlueprint (engine = canonical-mode shard, §6 contract, §7 page model, §8.1 journal) and TechnicalSpecificationDocument §A (the contract it implements). FederationRequirements ADR-05/06 supply overlay + link semantics.
  • Reuse surface — §9; G1/G2 proposals from SHARD-WP-0013 T3.

11. Decisions / deferred / open

Decided: small page-store kernel + typed-extension runtime (E-2/E-3); engine is one shard, not a federation layer (E-1); capability profile derived from active extensions (E-5); headless, API-first, agent-first (E-6/E-7); activation reuses feature-control (E-8); extensions are typed + conformance-verified (E-9).

Deferred: the concrete extension SDK/ABI and hook signatures; the API protocol (REST/GraphQL/MCP), where agent-first introspection is required and the wire format is an implementation spike; the built-in extensions' internal designs (each is a later workplan).

Open (tracked): does ext.compute ever execute in-process or strictly delegate/snapshot (ties blueprint §8.5 + trust/sandbox); is the typed-extension framework promoted to the reuse-surface platform capability (G1) and then consumed here rather than engine-owned; introspection granularity vs. leaking internal structure to agents.

12. Stability note

The thesis (§1) and invariants (§2), especially engine-is-one-shard (E-1), small-kernel/everything-else-typed-extension (E-2/E-3), and capability-profile-derived-from-extensions (E-5), are load-bearing. Changing them (e.g. moving federation into the engine, or baking a feature into the kernel) is an architectural change in the sense of INTENT's Stability Note and should be rare and deliberate. The headless/API-first posture is fixed by the ratified INTENT amendment.