shard-wiki/spec/WikiEngineCoreArchitecture.md
tegwick 0ee972f2e2 spec(SHARD-WP-0013 T5): WikiEngineCoreArchitecture.md — small core + typed extensions
Headless, API-first, agent-optimized native engine = canonical-mode shard backend.
Thesis: a page-store kernel with a typed-extension runtime; everything beyond the
c2-minimum is a typed extension activated per shard, and the shard's §A capability
profile is DERIVED from its active extensions (configuration->capability->conformance).
9 engine invariants (engine-is-one-shard, small kernel, per-shard activation,
profile-from-extensions, headless/agent-first, reuse-not-reinvent, typed+verified).
Kernel (4 concepts), typed-extension model (typed hooks + deterministic composition +
feature-control activation), T2 featureset/conflict-mediation realized, engine-as-shard,
agent-first API surface, module sketch, reuse (consumes feature-control/authorization;
G1 framework proposal), traceability, decisions/open, stability note. Marks T5 done.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 22:54:40 +02:00

# WikiEngineCoreArchitecture
Status: **draft for review** · Date: 2026-06-15 · Deliverable of **SHARD-WP-0013 T5**
The architecture of shard-wiki's **native reference wiki-engine**: a **headless, API-first**
engine — a **small core** plus a **stringent typed-extension framework** — that addresses the
whole use-case catalogue, mediates conflicting requirements into one integrated featureset, and
lets each shard **activate only what it needs**. Authoritative as of the ratified INTENT
amendment (2026-06-15, decision `84ffdb48`): the engine is **additive** and is shard-wiki's
**reference first-party shard backend (a canonical-mode shard)** — not a replacement for other
engines, not a UI.
Relation to other specs (referenced, not restated):
- `CoreArchitectureBlueprint.md` — the orchestrator/whole-system architecture. **The engine is
one shard behind §A; federation, union, projection, and cross-shard coordination are the
orchestrator's job, not the engine's.** That is what keeps the engine small.
- `TechnicalSpecificationDocument.md §A` — the shard adapter contract the engine implements.
- `FederationRequirements.md` — page resolution, overlay, link semantics (ADRs the engine reuses).
- `UseCaseCatalog.md` "Capability structure" layer (T2) — the core-vs-extension map + the
conflict-mediation map this document realizes.
- Reuse surface: `capability.wiki.*`, plus the consumed `feature-control` and `authorization` capabilities.
---
## 1. Thesis: a small page-store kernel; everything else is a typed extension
> **The engine is a page-store kernel with a typed-extension runtime. Every capability beyond
> the c2-minimum is a *typed extension* a shard activates only if it needs it — and a shard's
> externally-visible capability profile is *computed from its active extension set*.**
That single chain — **configuration (which extensions) → capability (what the shard can do) →
conformance (verified)** — is the whole design. It mirrors the orchestrator's discipline
(`CoreArchitectureBlueprint` §6.5: capability-as-data, verified, no per-backend code) and turns
"integrated whole, yet activate only what you need" from a slogan into a mechanism.
The engine stays small for a structural reason: it is **one shard**, not a federation layer.
Union, projection, equivalence, cross-shard overlay-orchestration, and the federation models all
live in shard-wiki's orchestrator (the blueprint). The engine implements `ShardAdapter` (§A) and
nothing above it. So "wiki engine" here means *a really good single canonical shard with a
typed-extension framework and a headless agent-first API* — not a re-implementation of shard-wiki.
---
## 2. Engine invariants
| # | Invariant | Why |
|---|-----------|-----|
| E-1 | **One shard, not a federation layer.** The engine implements `ShardAdapter` (§A); union/projection/federation are the orchestrator's. | Keeps the engine small; no duplication of the blueprint. |
| E-2 | **Small kernel.** The kernel is only: page store + history, the page model (reused), the extension runtime, the API. | Common case (a plain wiki) is trivial. |
| E-3 | **Everything else is a typed extension.** No feature beyond the c2-minimum is baked into the kernel. | Integrated-whole-yet-selective; testable boundary. |
| E-4 | **Per-shard activation.** A shard runs an *activation profile* (a set of extensions + config); unused features cost nothing. | "Activate only what you need." |
| E-5 | **Capability profile is derived from active extensions.** The §A profile the engine declares is computed from its activation profile, then conformance-verified. | One source of truth; honest, verified capabilities. |
| E-6 | **Headless & API-first.** The API is the only interface; no bundled UI/rendering (consumer concern, L6). | INTENT amendment; clean orchestrator/consumer split. |
| E-7 | **Agent-first ergonomics.** The API is typed, introspectable, batchable, low-round-trip. | INTENT: optimized for efficient agent/automation access. |
| E-8 | **Reuse over reinvent.** Page model, history/journal, activation, and authz are *consumed* (existing capabilities), not rebuilt. | Smallness; reuse-surface alignment. |
| E-9 | **Extensions are typed & verified.** An extension declares its types/hooks/deps; activation is rejected if types conflict or deps are unmet (impossible profiles forbidden). | Stringency; mirrors §6.5 + conformance. |
---
## 3. The kernel (four concepts)
The kernel is deliberately four things — nothing more is mandatory.
1. **Page** — the backend-neutral page model (`capability.wiki.page-model`, reused as-is):
stable identity ≠ placement, layered provenance, page shapes. The kernel does **not** redefine
it; extensions may *register additional shapes/types* (§4).
2. **Store + history** — a git-backed page store (the engine is the *git-IS-store* case from the
blueprint): a write is a commit; history is native and recoverable (EC-3/I-10). Coordination
decisions reuse the event-sourced journal (`capability.wiki.coordination-journal`).
3. **Extension runtime** — the typed-extension registry, hook dispatcher, type checker, and
activation engine (§4). *This is the core innovation; it is the only "framework" in the kernel.*
4. **API** — the headless, typed, agent-first surface (§7). Kernel endpoints cover the c2-minimum
(page CRUD-as-history, links, history); extensions extend the surface through typed routes.
The **c2-minimum** a kernel-only shard delivers (no extensions): write a page, link pages
(`[[wikilink]]` + red-link), never lose an edit. That is a complete, useful headless wiki.
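
To make the link half of the c2-minimum concrete, here is a minimal sketch. It assumes pages are held in an in-memory dict and that links use `[[Name]]` with an optional `[[Name|label]]` form; the label form and the names `extract_links` and `red_links` are illustrative assumptions, not kernel API.

```python
import re

# Matches [[Target]] or [[Target|label]]; the label form is an assumption.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_links(text: str) -> list[str]:
    """Return link targets in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for match in WIKILINK.finditer(text):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


def red_links(text: str, existing: set[str]) -> list[str]:
    """Return targets that point at pages which do not exist yet."""
    return [target for target in extract_links(text) if target not in existing]


# Hypothetical two-page shard; "Missing Page" has not been written.
pages = {
    "HomePage": "See [[WikiEngine]] and [[Missing Page|a stub]]; also [[WikiEngine]].",
    "WikiEngine": "Back to [[HomePage]].",
}
print(red_links(pages["HomePage"], set(pages)))  # ['Missing Page']
```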
---
## 4. The typed-extension model (the framework)
An **Extension** is a typed unit declaring a contract the runtime enforces:
```
Extension:
  id             : reverse-domain id (e.g. ext.struct.typed-records)
  provides       : capability ids it realizes (reuse-surface; e.g. capability.wiki.page-model[typed])
  types          : page shapes / field schemas / content-types it introduces (typed, validated)
  hooks          : kernel lifecycle bindings it implements (see below)
  api            : typed routes it adds to the headless surface
  depends_on     : other extensions / consumed capabilities required
  conflicts_with : extensions it cannot co-activate with
  config         : declared, schema-checked activation parameters
```
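
One way the contract above could look in Python is sketched below. The concrete SDK is deferred (§11), so every field type here is an assumption: `config_schema` stands in for `config` as a plain key-to-type mapping, and `check_config` is a hypothetical helper showing what "schema-checked" might mean at its simplest.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class Extension:
    id: str
    provides: frozenset[str] = frozenset()
    types: Mapping[str, Mapping[str, str]] = field(default_factory=dict)
    hooks: Mapping[str, Callable[..., Any]] = field(default_factory=dict)
    api: tuple[str, ...] = ()
    depends_on: frozenset[str] = frozenset()
    conflicts_with: frozenset[str] = frozenset()
    config_schema: Mapping[str, type] = field(default_factory=dict)

    def check_config(self, config: Mapping[str, Any]) -> list[str]:
        """Return human-readable problems; an empty list means accepted."""
        problems = [f"unknown key: {k}" for k in config if k not in self.config_schema]
        for key, expected in self.config_schema.items():
            if key not in config:
                problems.append(f"missing key: {key}")
            elif not isinstance(config[key], expected):
                problems.append(f"{key}: expected {expected.__name__}")
        return problems


# Hypothetical declaration; the "strict" parameter is invented for illustration.
typed_records = Extension(
    id="ext.struct.typed-records",
    provides=frozenset({"capability.wiki.page-model[typed]"}),
    types={"record": {"title": "str"}},
    config_schema={"strict": bool},
)
print(typed_records.check_config({"strict": "yes"}))  # ['strict: expected bool']
```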
**Hooks (the kernel lifecycle the runtime dispatches):**
`on_resolve` (name→page), `on_read`, `on_write` (validate/transform a draft), `on_link`
(link/transclusion resolution), `on_history`, `on_query`, `on_render_request` (produce a derived
representation for a consumer), `on_profile` (contribute capability-spectrum positions, E-5).
Hooks are **typed** (typed inputs/outputs) and dispatched in a **declared, deterministic order**.
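
A minimal sketch of typed, deterministically ordered dispatch follows. Hook signatures are deferred (§11), so this assumes an `on_write` handler maps draft text to draft text, that order is declared as an integer with the extension id as tie-breaker, and that `Binding` and `HookDispatcher` are hypothetical names.

```python
from typing import Callable, NamedTuple


class Binding(NamedTuple):
    order: int                     # declared position; lower runs first (assumed)
    extension_id: str              # tie-breaker so equal orders stay deterministic
    handler: Callable[[str], str]  # on_write: draft text in, draft text out


class HookDispatcher:
    def __init__(self) -> None:
        self._bindings: dict[str, list[Binding]] = {}

    def bind(self, hook: str, binding: Binding) -> None:
        self._bindings.setdefault(hook, []).append(binding)

    def dispatch(self, hook: str, draft: str) -> str:
        # Sort on (order, extension_id) only, so registration order never matters.
        ordered = sorted(self._bindings.get(hook, []),
                         key=lambda b: (b.order, b.extension_id))
        for binding in ordered:
            result = binding.handler(draft)
            if not isinstance(result, str):  # runtime check of the typed output
                raise TypeError(f"{binding.extension_id}: {hook} must return str")
            draft = result
        return draft


dispatcher = HookDispatcher()
# Registered out of order on purpose; the declared order decides.
dispatcher.bind("on_write", Binding(20, "ext.prov", lambda d: d + "\n<!-- provenance -->"))
dispatcher.bind("on_write", Binding(10, "ext.struct", str.strip))
print(repr(dispatcher.dispatch("on_write", "  hello  ")))
```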
**Typing & composition (stringency):**
- At activation, the runtime builds the **dependency closure**, checks **type consistency** (no
two active extensions claim incompatible types for the same page shape/field; `conflicts_with`
honoured), and rejects an **impossible profile** — exactly the §6.5 implication-rule discipline,
applied to extensions. A rejected profile fails fast at boot, never silently.
- Composition is **deterministic**: hook order is declared; conflicts are resolved by explicit
precedence or rejection, never by accident.
- Extensions ship a **conformance check** (mirrors §6.6): an activated extension is exercised
against its declared types/hooks before the shard serves traffic — *typed contracts verified,
not trusted*.
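
The activation-time checks described above can be sketched as follows. This is an illustration under stated assumptions, not the runtime's algorithm: types are flattened to a `"shape.field" -> type name` mapping, `Ext`, `resolve`, and `ImpossibleProfile` are hypothetical names, and `ext.legacy` is an invented extension used only to provoke a type clash.

```python
from dataclasses import dataclass, field


class ImpossibleProfile(Exception):
    """Raised at boot when an activation profile cannot be composed."""


@dataclass
class Ext:
    id: str
    depends_on: set[str] = field(default_factory=set)
    conflicts_with: set[str] = field(default_factory=set)
    types: dict[str, str] = field(default_factory=dict)  # "shape.field" -> type name


def resolve(requested: set[str], registry: dict[str, Ext]) -> list[str]:
    """Return the dependency closure (sorted), or raise ImpossibleProfile."""
    closure: set[str] = set()
    stack = sorted(requested)
    while stack:
        ext_id = stack.pop()
        if ext_id in closure:
            continue
        if ext_id not in registry:
            raise ImpossibleProfile(f"unmet dependency: {ext_id}")
        closure.add(ext_id)
        stack.extend(sorted(registry[ext_id].depends_on))
    claimed: dict[str, tuple[str, str]] = {}
    for ext_id in sorted(closure):
        ext = registry[ext_id]
        clash = ext.conflicts_with & closure
        if clash:
            raise ImpossibleProfile(f"{ext_id} conflicts with {sorted(clash)}")
        for key, type_name in ext.types.items():
            owner, seen = claimed.setdefault(key, (ext_id, type_name))
            if seen != type_name:
                raise ImpossibleProfile(
                    f"{key}: {owner} says {seen}, {ext_id} says {type_name}")
    return sorted(closure)


registry = {
    "ext.views": Ext("ext.views"),
    "ext.struct": Ext("ext.struct", types={"record.title": "str"}),
    "ext.addr": Ext("ext.addr", depends_on={"ext.views"}),
    "ext.legacy": Ext("ext.legacy", types={"record.title": "int"}),
}
print(resolve({"ext.addr", "ext.struct"}, registry))
```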
**Per-shard activation (reuse, not reinvent):**
- A shard's **activation profile** = `{extension id → config}`. Activation/evaluation **reuses
`capability.feature-control.evaluate`** (helix_forge/feature-control) — shard-wiki does not
build a bespoke flagging system (T3 consumption).
- **E-5 in action:** the engine's `on_profile` hooks fold the active extensions into the §A
**capability profile** the shard advertises to the orchestrator (e.g. activate
`ext.struct.typed-records` → the `structure` spectrum rises and `structured-payload` is
declared). The profile is then conformance-verified (§A.2). *Configuration → capability →
conformance is one chain.*
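
The configuration-to-capability step can be illustrated with a small fold. The shape of the §A profile is not restated in this document, so the sketch assumes spectrum positions are integers combined with `max` and that declared capabilities are a flag set; `Contribution`, `ON_PROFILE`, and `derive_profile` are hypothetical names, and the numeric positions are invented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Contribution:
    spectra: dict[str, int] = field(default_factory=dict)  # spectrum -> position
    declares: frozenset[str] = frozenset()                 # capability flags


# Hypothetical on_profile results, keyed by extension id.
ON_PROFILE = {
    "ext.views": Contribution(spectra={"views": 1},
                              declares=frozenset({"backlinks"})),
    "ext.struct.typed-records": Contribution(
        spectra={"structure": 2},
        declares=frozenset({"structured-payload"})),
}


def derive_profile(active: set[str]) -> dict:
    """Fold the on_profile contributions of active extensions into one profile."""
    spectra = {"structure": 0, "views": 0}  # kernel-only baseline (assumed)
    declares: set[str] = set()
    for ext_id in sorted(active):           # sorted for a deterministic result
        contribution = ON_PROFILE[ext_id]
        for name, position in contribution.spectra.items():
            spectra[name] = max(spectra.get(name, 0), position)
        declares |= contribution.declares
    return {"spectra": spectra, "declares": sorted(declares)}


# Activating typed records raises "structure" and declares structured-payload.
print(derive_profile({"ext.struct.typed-records"}))
```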
---
## 5. Featureset map: core vs extensions, and conflict mediation
The engine realizes the T2 "Capability structure" layer (`UseCaseCatalog.md`). Mapping (the
*page/content-level* clusters; **X-FED and X-ATT are orchestrator concerns, not engine
extensions** — E-1):
| Engine kernel (always on) | T2 | reuse-surface |
|---------------------------|----|---------------|
| Page lifecycle, identity/placement, history, links, store | EC-1…EC-5 | `capability.wiki.page-model`, `…coordination-journal`, `…adapter-contract` |

Built-in typed extensions, with their T2 cluster, reuse-surface binding, and default state:

| Built-in typed extension | T2 cluster | provides / consumes | default |
|--------------------------|-----------|---------------------|---------|
| `ext.overlay` | X-OVERLAY | `capability.wiki.overlay` | on (no-op locally) |
| `ext.authz` (L0→L4 tiers) | X-AUTHZ | consumes `capability.authorization.policy-evaluate` | L0 |
| `ext.views` (BackLinks/RecentChanges/…) | X-VIEW | `capability.wiki.derived-views` | BackLinks/RecentChanges on |
| `ext.struct` (typed/computed/graph) | X-STRUCT | `capability.wiki.page-model[typed]` | off |
| `ext.addr` (span addr / transclusion / query) | X-ADDR | `capability.wiki.page-model`+query | transclusion on |
| `ext.compute` (literate/notebook/program/live) | X-COMP | `capability.wiki.engine-typed-extensions` | off (gated, sandbox) |
| `ext.prov` (rich provenance/metadata) | X-PROV | `capability.wiki.page-model[provenance]` | base on |
| `ext.collab` (c2 social patterns) | X-COLLAB | (UI/convention; mostly consumer) | off |
**Conflict mediation (T2 map) realized by the framework** — every tension is a *mechanism*, not a
baked-in choice, so one featureset serves all:
| Tension | Realized by |
|---------|-------------|
| open vs governed | `ext.authz` tiers (additive); kernel history is the floor at L0 |
| lossless vs lossy | a `translate` hook + fidelity report (consumes the proposed `capability.content.translation-fidelity`, G2) |
| live vs snapshot | `ext.compute`/`ext.addr` mark liveness; degrade to snapshot (never imply live) |
| canonical vs chorus | detection in kernel; resolution is a policy preset (orchestrator) |
| integrated-whole vs only-what-you-need | **the activation profile** (E-4) + typed composition (§4) — the headline mediation |
| minimal vs feature-rich | small kernel (§3) + extensions; nothing beyond c2 is mandatory |
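
As an example of a tension realized as a mechanism, the "live vs snapshot" row could be sketched like this. It assumes each derived value carries an explicit liveness label and the revision it was computed from; `Rendered` and `resolve_span` are hypothetical names and do not come from `ext.compute` or `ext.addr`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rendered:
    value: str
    liveness: str   # "live" or "snapshot"
    as_of_rev: str  # revision the value was computed from


def resolve_span(fetch_live: Callable[[], Optional[Rendered]],
                 snapshot: Rendered) -> Rendered:
    """Prefer a live value; otherwise return the snapshot, labeled as such."""
    try:
        live = fetch_live()
    except Exception:
        live = None  # a failed live source degrades instead of raising
    if live is not None and live.liveness == "live":
        return live
    # Re-label defensively so a stored value is never presented as live.
    return Rendered(snapshot.value, "snapshot", snapshot.as_of_rev)


def unavailable() -> Optional[Rendered]:
    raise RuntimeError("compute backend offline")


stored = Rendered("41", "snapshot", "rev-1")
print(resolve_span(unavailable, stored).liveness)                        # snapshot
print(resolve_span(lambda: Rendered("42", "live", "rev-2"), stored).value)  # 42
```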
---
## 6. The engine as a canonical-mode shard
The engine exposes itself through an `EngineShardAdapter` implementing §A:
- **Substrate** git-IS-store; **history** git-native; **write** = commit; `current_rev` = sha
(apply-under-drift works out of the box). It is the **most capable shard** shard-wiki can
attach — it dogfoods the contract.
- Its **capability profile is computed from active extensions** (E-5) and **conformance-verified**
(§A.2) — so the orchestrator sees an honest profile, and federation ops degrade by the engine's
*actually-activated* capabilities.
- The orchestrator attaches it like any shard; **federation/union/projection are not in the
engine** (E-1). A standalone deployment is "the engine as the sole canonical shard"; a
federated deployment is "the engine as one shard among many." Same engine, no re-architecture.
This is the precise realization of the INTENT reconciliation: shard-wiki orchestrates; the engine
is the first-party shard it can attach.
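
The "write = commit, `current_rev` = sha" bullet can be illustrated without git. The sketch below uses `hashlib` as a stand-in for commit hashing and an in-memory append-only list as the store; `PageHistory` and `Drift` are hypothetical names, and treating drift as "the draft's base revision no longer matches `current_rev`" is an assumption about what apply-under-drift checks.

```python
import hashlib
from typing import Optional


class Drift(Exception):
    """The page moved on since the draft was taken."""


class PageHistory:
    def __init__(self) -> None:
        self._revs: list[tuple[str, str]] = []  # (rev, text), append-only

    @property
    def current_rev(self) -> Optional[str]:
        return self._revs[-1][0] if self._revs else None

    def write(self, text: str, base_rev: Optional[str]) -> str:
        """Append a revision if the draft was based on the current one."""
        if base_rev != self.current_rev:
            raise Drift(f"draft based on {base_rev}, store is at {self.current_rev}")
        parent = base_rev or ""
        # Hash parent + content, loosely mimicking how a commit id chains history.
        rev = hashlib.sha1(f"{parent}\n{text}".encode()).hexdigest()
        self._revs.append((rev, text))
        return rev

    def read(self, rev: Optional[str] = None) -> str:
        """Read the latest text, or any earlier revision by id."""
        if rev is None:
            return self._revs[-1][1]
        return dict(self._revs)[rev]


history = PageHistory()
r1 = history.write("first draft", None)
r2 = history.write("second draft", r1)
try:
    history.write("stale edit", r1)  # based on an outdated revision
except Drift as exc:
    print("rejected:", exc)
print(history.read(r1))  # earlier revisions stay readable
```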
---
## 7. Headless API surface & agent ergonomics (E-6/E-7)
API-first means the typed API is the product; there is no UI. Agent-first means it is designed
for cheap, deterministic machine consumption:
- **Typed resource API** over pages, links, history, spans — content-negotiated (raw Markdown,
the structured page model, or an extension-rendered representation via `on_render_request`).
- **Capability/extension introspection** — an endpoint returns the shard's **active extensions,
their types, and the derived §A capability profile**, so an agent can discover *what this shard
can do* before acting (no trial-and-error). This is the agent-facing twin of E-5.
- **Batch & query** — multi-page reads, link-graph and RecentChanges queries (via `ext.views`),
and `on_query` delegation — minimizing round-trips.
- **Write via overlay** — edits go through the overlay path (FederationRequirements ADR-05), so
agent writes are safe (draft → apply-under-drift) and attributable.
- **Deterministic & provenance-carrying** — every response carries the provenance envelope;
identical inputs yield identical outputs (no hidden state) — friendly to caching agents.
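
A sketch of what an introspection response might contain, and how determinism could be enforced at serialization time. The wire format is deferred (§11), so the field names (`extensions`, `capability_profile`, `provenance`) and the function `introspect` are assumptions; the point shown is only that canonical ordering makes equal inputs produce byte-identical output.

```python
import json


def introspect(active: dict[str, dict], profile: dict) -> str:
    """Serialize active extensions, their types, and the derived profile."""
    payload = {
        "extensions": [
            {"id": ext_id, "types": sorted(meta.get("types", []))}
            for ext_id, meta in sorted(active.items())
        ],
        "capability_profile": profile,
        # Hypothetical provenance envelope; the real envelope is not specified here.
        "provenance": {"source": "engine", "derived_from": sorted(active)},
    }
    # sort_keys plus fixed separators give one canonical encoding per payload.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


# Same logical input, different insertion order and type order.
first = introspect(
    {"ext.views": {"types": []}, "ext.struct": {"types": ["record", "graph"]}},
    {"structure": 2},
)
second = introspect(
    {"ext.struct": {"types": ["graph", "record"]}, "ext.views": {}},
    {"structure": 2},
)
print(first == second)  # True
```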
---
## 8. Implementation sketch (module layout)
The engine lives under the shard-wiki package as a backend (it sits at L0/L1 — a shard behind the
adapter; nothing in the orchestrator depends *up* on it):
```
src/shard_wiki/engine/
  kernel.py       # page store + history (git-IS-store), lifecycle; reuses model/, provenance/, coordination/
  extension.py    # Extension contract, registry, typed hook dispatcher, type checker
  activation.py   # activation profile; reuses capability.feature-control.evaluate
  profile.py      # derive the §A CapabilityProfile from active extensions (E-5) + conformance
  api.py          # headless, typed, agent-first surface (+ extension introspection)
  adapter.py      # EngineShardAdapter implements adapters/ ShardAdapter (canonical-mode shard)
  extensions/     # built-ins: overlay/ authz/ views/ struct/ addr/ compute/ prov/ collab/
```
Dependency rule: `engine/` consumes `model/`, `provenance/`, `coordination/`, `adapters/`
(contract), `policy/`; it is consumed *only* via its `EngineShardAdapter` (the orchestrator
attaches it as a shard). No orchestrator-tier (`union/`, `projection/`) import.
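
The dependency rule lends itself to an automated check. The sketch below uses the standard-library `ast` module to flag orchestrator-tier imports in a piece of engine source; the dotted module paths `shard_wiki.union` and `shard_wiki.projection` are inferred from the layout above and are assumptions, as is the function name `forbidden_imports`. Relative imports are skipped for brevity.

```python
import ast

# Assumed dotted paths for the orchestrator tier.
FORBIDDEN = ("shard_wiki.union", "shard_wiki.projection")


def _is_forbidden(name: str) -> bool:
    return any(name == p or name.startswith(p + ".") for p in FORBIDDEN)


def forbidden_imports(source: str) -> list[str]:
    """Return orchestrator-tier names imported by the given engine source."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Join module and name so "from shard_wiki import union" is caught too.
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        found.extend(name for name in names if _is_forbidden(name))
    return sorted(found)


sample = (
    "import shard_wiki.model\n"
    "from shard_wiki.union import merge\n"
    "from shard_wiki import projection\n"
)
print(forbidden_imports(sample))
```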
---
## 9. Reuse (what the engine consumes vs registers)
- **Consumes:** `capability.feature-control.evaluate` (activation),
`capability.authorization.policy-evaluate` (`ext.authz`), the proposed
`capability.content.translation-fidelity` (G2,
lossy translation), and shard-wiki's own `capability.wiki.{page-model, coordination-journal,
adapter-contract, overlay, derived-views}`.
- **Registers / realizes:** `capability.wiki.engine-typed-extensions` (this document is its
Discovery evidence — D2→D3 on ratification). The cross-cutting **typed-extension framework**
pattern is proposed back to the reuse surface as **G1**
(`capability.platform.typed-extension-framework`); this engine is its first instance.
---
## 10. Traceability
- **INTENT** — realizes the 2026-06-15 amendment (decision `84ffdb48`): headless, API-first,
additive native engine = canonical-mode shard backend; honours all engine invariants and the
orchestrator boundary (E-1).
- **Use cases** — the kernel/extension split *is* the T2 "Capability structure" layer
(`UseCaseCatalog.md`); every UC is either kernel (EC-1…EC-5) or a named extension; conflicts
use the T2 mediation map (§5). The engine must ultimately cover UC-01…UC-84 (per-shard subsets).
- **Architecture** — consistent with `CoreArchitectureBlueprint` (engine = canonical-mode shard,
§6 contract, §7 page model, §8.1 journal) and `TechnicalSpecificationDocument §A` (the contract
it implements). `FederationRequirements` ADR-05/06 supply overlay + link semantics.
- **Reuse surface** — §9; G1/G2 proposals from SHARD-WP-0013 T3.
## 11. Decisions / deferred / open
**Decided:** small page-store kernel + typed-extension runtime (E-2/E-3); engine is one shard,
not a federation layer (E-1); capability profile derived from active extensions (E-5); headless,
API-first, agent-first (E-6/E-7); activation reuses `feature-control` (E-8); extensions are
typed + conformance-verified (E-9).
**Deferred:** the concrete extension SDK/ABI and hook signatures; the API protocol (REST/GraphQL/
MCP) — agent-first introspection is required, the wire format is an implementation spike; the
built-in extensions' internal designs (each is a later workplan).
**Open (tracked):** does `ext.compute` ever execute in-process or strictly delegate/snapshot
(ties blueprint §8.5 + trust/sandbox); is the typed-extension framework promoted to the
reuse-surface platform capability (G1) and then *consumed* here rather than engine-owned;
introspection granularity vs. leaking internal structure to agents.
## 12. Stability note
The **thesis (§1)** and **invariants (§2)** — especially *engine-is-one-shard* (E-1),
*small-kernel/everything-else-typed-extension* (E-2/E-3), and
*capability-profile-derived-from-extensions* (E-5) — are load-bearing. Changing them (e.g. moving federation into the engine, or
baking a feature into the kernel) is an architectural change in the sense of INTENT's Stability
Note and should be rare and deliberate. The headless/API-first posture is fixed by the ratified
INTENT amendment.