Compare commits

15 Commits

Author SHA1 Message Date
776f5af5a7 Normalize agent instructions and workplan frontmatter (STATE-WP-0067)
- Align agent files with on-disk workplan prefixes (infer from workplan ids)
- Set workplan domain to registered domain_slug; add topic_slug where applicable
- Repair frontmatter delimiter formatting; migrate legacy task status literals
- Regenerate AGENTS.md, CLAUDE.md, and .claude/rules from State Hub templates
2026-06-22 23:16:28 +02:00
fd961c83b4 Add .repo-classification.yaml (CUST-WP-0050 T11 agent first-pass) 2026-06-22 17:47:43 +02:00
cca5bf83c3 Add credential routing instructions for all agent runtimes
Propagate shared credential-routing section (Codex, Claude, Grok, llm-connect)
from state-hub template via scripts/propagate_credential_routing.py.
2026-06-18 22:48:39 +02:00
def699c1eb feat(adapters): GitShardAdapter history adopt + cross-substrate integration (WP-0012 T3)
Adopt git-native history (TSD §A.5): a VERSION-gated history(key) surfaces the
commit list for a path (newest-first sha + subject) — declared by every git-IS-store
shard, read-only or not. Integration proves the union/overlay/edit machinery works
unchanged across folder + git substrates: resolve/chorus span both, edit through a
git shard fast-forwards as a commit, apply-under-drift refuses on an external commit
(sha drift) without clobbering, and a read-only git target keeps the overlay as a
draft. SCOPE updated; WP-0012 done. 196 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:41:19 +02:00
a4e0f52ec1 feat(adapters): GitShardAdapter write=commit + current_rev drift (WP-0012 T2)
Writable mode: write(key, body) stages and commits the file (skipping a no-op so
no empty commit is created), returning the page at the new commit sha. The
writable profile declares WRITE + VERSION with PER_PAGE granularity. current_rev
is the per-path commit sha, so a write — or an external commit to the same path —
moves it, driving apply-under-drift. Passes the conformance positive-write probe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:38:41 +02:00
4231daf94f feat(adapters): GitShardAdapter read path + git-IS-store profile (WP-0012 T1)
A second substrate validating the contract beyond plain folders: a git-IS-store
shard reading Markdown from a git repo. Keys are tracked *.md paths; read returns
a Page whose source_rev is the per-path last-commit sha (so an edit to one page
never drifts another); profile is git-IS-store / substrate=git / history=git-native
/ addressing=path, validated against the §6.5 implication rules. Passes the
conformance read path with honest absence of unclaimed verbs. Zero new deps
(git CLI via subprocess). No core changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:36:28 +02:00
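The per-path `source_rev` described in this commit can be sketched with the git CLI. This is a minimal sketch under assumptions, not the adapter's actual code: the helper names `last_commit_argv` and `source_rev` are hypothetical, and the only thing taken as given is the standard `git log -1 --format=%H -- <path>` invocation.

```python
import subprocess
from typing import List, Optional


def last_commit_argv(path: str) -> List[str]:
    """Build the git command that prints the sha of the last commit touching `path`."""
    # "--" separates revisions from paths so a path is never read as a ref.
    return ["git", "log", "-1", "--format=%H", "--", path]


def source_rev(repo_dir: str, path: str) -> Optional[str]:
    """Return the per-path last-commit sha, or None if the path has no history.

    Hypothetical helper: because the sha is scoped to one path, a commit that
    edits another page leaves this value unchanged.
    """
    result = subprocess.run(
        last_commit_argv(path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    sha = result.stdout.strip()
    return sha or None
```

Scoping the revision token to a path rather than to `HEAD` is what keeps an edit to one page from registering as drift on every other page in the shard.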
37681d89b6 feat(incremental): wire maintained tier behind views; rebuild fallback (WP-0011 T4)
Route InformationSpace.all_pages through a maintained UnionIndex: equivalence is
served from the incrementally maintained index (curator bindings re-synced live
from the log fold + detected content edges), exposed in decision-log string form
so results are a behaviour-preserving superset. The index is built lazily and
rebuilt (bounded fallback) when the union mutates (attach/edit invalidate it);
reindex() forces a rebuild and verify_index() runs the I-2 self-healing checker.
all_pages() gains an optional equivalence_groups source (default = fold) so
direct callers are unaffected. SCOPE updated; WP-0011 done. 173 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:21:39 +02:00
a8e65235a8 feat(incremental): I-2 digest + consistency-checker (WP-0011 T3)
A Merkle-style digest summarizes the derived tier (per-identity fingerprint +
incident edges as order-independent leaves) so equal states have equal digests
and the digest is stable under equivalent event orders. A ConsistencyChecker
recomputes the authoritative fold from the current source, compares it over a
sampled region, and on mismatch scoped-recomputes just the affected identities —
self-healing missed-delta drift, corrupted internal state, and vanished pages.
Makes derived = f(canonical) verified, not asserted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:16:50 +02:00
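The order-independence property claimed for the digest can be illustrated as follows. This is a simplified, single-level sketch (sorted leaves hashed into one root, not a full Merkle tree), and the names `leaf` and `tier_digest` as well as the shape of `state` are assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterable, Tuple


def leaf(identity: str, fingerprint: str, edges: Iterable[str]) -> bytes:
    """Hash one identity together with its fingerprint and incident edges."""
    # Edges are sorted so the leaf does not depend on discovery order.
    payload = "\x00".join([identity, fingerprint, *sorted(edges)])
    return hashlib.sha256(payload.encode("utf-8")).digest()


def tier_digest(state: Dict[str, Tuple[str, Iterable[str]]]) -> str:
    """Summarize a derived tier; `state` maps identity -> (fingerprint, edges)."""
    # Leaves are sorted before hashing, so insertion order cannot change the root.
    leaves = sorted(leaf(i, fp, edges) for i, (fp, edges) in state.items())
    root = hashlib.sha256()
    for item in leaves:
        root.update(item)
    return root.hexdigest()
```

Two states built from the same facts in a different event order hash to the same value, while any changed fingerprint or edge changes the digest.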
d7d046cac0 test(incremental): delta maintenance == rebuild, retraction + split (WP-0011 T2)
Verify change-driven maintenance keeps the equivalence index equal to a
from-scratch rebuild under add / edit / remove: an edit into a new bucket
retracts the stale edge, an edit into equivalence adds one, and removing a
connector node propagates a retraction that splits a chorus. Equality checked
against a fresh build() oracle on every operation.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:14:32 +02:00
0b3ab2086f feat(incremental): indexed equivalence — blocking + verify (WP-0011 T1)
Detect equivalence (distinct identities holding the same page) without pairwise
O(N²): MinHash/LSH bands over content shingles + normalized-title buckets
generate candidates (blocking), then exact-fingerprint or Jaccard>=threshold
confirm them (verify), with curator decision-log bindings always forming edges.
Groups are the connected components of the edge set. Includes the incremental
add/update/remove internals used by T2. Matches a brute-force oracle. New
incremental/ package (minhash primitives + EquivalenceIndex).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:13:06 +02:00
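The blocking-then-verify idea in this commit can be sketched as below. It is an illustrative sketch only: it covers MinHash/LSH banding and Jaccard verification, omits the normalized-title buckets and curator bindings, and every name and default (`shingles`, `signature`, `equivalence_groups`, `bands`, `rows`, `threshold`) is an assumption rather than the package's API.

```python
import hashlib
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set


def shingles(text: str, k: int = 3) -> Set[str]:
    """Word k-shingles of a page body (never empty)."""
    words = text.lower().split()
    if len(words) <= k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed: int, shingle: str) -> int:
    digest = hashlib.blake2b(shingle.encode("utf-8"), digest_size=8,
                             salt=seed.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def signature(sh: Set[str], num_hashes: int) -> List[int]:
    """MinHash signature: the minimum hash per seeded hash function."""
    return [min(_hash(seed, s) for s in sh) for seed in range(num_hashes)]


def jaccard(a: Set[str], b: Set[str]) -> float:
    return len(a & b) / len(a | b)


def equivalence_groups(docs: Dict[str, str], bands: int = 8, rows: int = 4,
                       threshold: float = 0.8) -> List[Set[str]]:
    sh = {key: shingles(body) for key, body in docs.items()}
    sig = {key: signature(s, bands * rows) for key, s in sh.items()}

    # Blocking: pages sharing any full band land in the same bucket.
    buckets = defaultdict(list)
    for key, s in sig.items():
        for b in range(bands):
            buckets[(b, tuple(s[b * rows:(b + 1) * rows]))].append(key)
    candidates = set()
    for members in buckets.values():
        candidates.update(combinations(sorted(members), 2))

    # Verify: only confirmed candidates become edges (union-find components).
    parent = {key: key for key in docs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in candidates:
        if jaccard(sh[a], sh[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for key in docs:
        groups[find(key)].add(key)
    return sorted((g for g in groups.values() if len(g) > 1), key=min)
```

Only pairs that share a bucket are ever compared, which is what avoids the pairwise O(N²) scan; the verify step keeps blocking collisions from producing false edges.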
d85d019543 feat(views): wire derived views onto InformationSpace + integration (WP-0010 T5)
Expose backlinks(name), recent_changes(), all_pages(), site_map() on
InformationSpace. Integration test exercises all four over two shards (BackLinks
aggregate across shards, AllPages/SiteMap span the union, RecentChanges merges an
alias decision with shard edits). SCOPE updated; WP-0010 done. 152 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:05:12 +02:00
3a5acdcb28 feat(views): AllPages + SiteMap enumeration views (WP-0010 T4)
AllPages enumerates the union's distinct pages, collapsing chorus (same key
across shards) and equivalence-bound identities into one entry via union-find,
noting divergence when members' bodies differ (collapse acknowledged, not
silent). SiteMap builds the namespace tree from page placements, spanning shards.
Both derived/recomputable and presentation-free.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 02:03:15 +02:00
34b0c539f3 feat(views): RecentChanges merged change feed (WP-0010 T3)
One newest-first feed merging the coordination journal (overlay/alias/fork/merge/
binding decisions, with actor + payload) and shard change signals (page
source_rev / mtime). Each entry carries provenance: the originating shard for an
edit, or 'coordination' (and the actor) for a decision. Non-temporal revision
tokens are skipped gracefully. Derived/recomputable; notify-streaming later.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:59:11 +02:00
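A merged newest-first feed with provenance might look roughly like this. The entry shapes (`at`, `actor`, `kind`, `shard`, `key`, `rev`) and the function name `recent_changes` are assumptions for illustration; the sketch treats any revision token that does not parse as an ISO timestamp as non-temporal and skips it.

```python
from datetime import datetime
from typing import Dict, Iterable, List, Optional


def _parse(token: object) -> Optional[datetime]:
    """Return a timestamp, or None for non-temporal tokens such as a commit sha."""
    try:
        return datetime.fromisoformat(str(token))
    except ValueError:
        return None


def recent_changes(journal: Iterable[Dict], shard_changes: Iterable[Dict]) -> List[Dict]:
    feed = []
    for event in journal:
        when = _parse(event["at"])
        if when is not None:
            # Decisions carry 'coordination' plus the actor as provenance.
            feed.append({"at": when, "source": "coordination",
                         "actor": event["actor"], "what": event["kind"]})
    for change in shard_changes:
        when = _parse(change["rev"])
        if when is not None:
            # Edits carry the originating shard as provenance.
            feed.append({"at": when, "source": change["shard"], "what": change["key"]})
    return sorted(feed, key=lambda entry: entry["at"], reverse=True)
```

Because the feed is recomputed from the journal and the shard signals on each call, nothing canonical is stored in the view itself.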
da540d4eea feat(views): BackLinks derived view over the union link graph (WP-0010 T2)
For any page name, the set of pages that link to it: extract wikilinks from every
union page (new UnionGraph.iter_pages enumeration) and index the resolved ones by
target name. Red-links create no backlinks; entries carry source provenance; a
chorus target aggregates the backlinks of all members under one name. Derived/
recomputable, stores nothing canonical.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:56:48 +02:00
951b24300d feat(views): wikilink + red-link model (WP-0010 T1)
A CommonMark wikilink extension: extract [[Target]] / [[Target|label]] from a
page body (skipping fenced + inline code, preserving offsets), and resolve each
target through the union — resolved is a link, unresolved is a createable
red-link (never a dropped reference). CamelCase auto-linking is off by default,
opt-in per space, and never double-counts a target already inside [[...]]. Link
model + resolution are core; rendering stays L6. New views/ package.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 01:55:06 +02:00
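The extraction and red-link rules in this commit can be sketched with regular expressions. This is a rough sketch rather than the `views/` implementation: it masks code instead of using a CommonMark parser, ignores CamelCase auto-linking, and the names `WikiLink`, `extract_wikilinks`, and `classify` are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple


class WikiLink(NamedTuple):
    target: str
    label: Optional[str]
    start: int  # offset of "[[" in the original body
    end: int    # offset just past "]]"


_FENCE = re.compile(r"^```.*?^```[^\n]*$", re.DOTALL | re.MULTILINE)
_INLINE = re.compile(r"`[^`\n]+`")
_LINK = re.compile(r"\[\[([^\[\]|\n]+)(?:\|([^\[\]\n]+))?\]\]")


def _mask(body: str) -> str:
    """Blank out code with spaces so offsets still refer to the original body."""
    def blank(match: "re.Match[str]") -> str:
        return re.sub(r"[^\n]", " ", match.group(0))
    return _INLINE.sub(blank, _FENCE.sub(blank, body))


def extract_wikilinks(body: str) -> List[WikiLink]:
    links = []
    for m in _LINK.finditer(_mask(body)):
        label = m.group(2).strip() if m.group(2) else None
        links.append(WikiLink(m.group(1).strip(), label, m.start(), m.end()))
    return links


def classify(links: Iterable[WikiLink],
             known_pages: Iterable[str]) -> Tuple[List[WikiLink], List[WikiLink]]:
    """Split links into resolved links and createable red-links; none are dropped."""
    known = set(known_pages)
    links = list(links)
    resolved = [link for link in links if link.target in known]
    red = [link for link in links if link.target not in known]
    return resolved, red
```

Masking rather than deleting code spans is the design choice that preserves offsets: every reported `start`/`end` can be used to slice the untouched page body.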
53 changed files with 2744 additions and 126 deletions

20
.claude/rules/agents.md Normal file

@@ -0,0 +1,20 @@
## Kaizen Agents
Specialized agent personas available on demand via the state-hub MCP.
**Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
**Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them
Common agents:
| Agent | Category | When to use |
|-------|----------|-------------|
| `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
| `code-refactoring` | quality | Code quality analysis and safe refactoring |
| `test-maintenance` | testing | Diagnose and fix failing tests |
| `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
| `keepaTodofile` | process | Maintain TODO.md during work |
| `project-management` | process | Track status, determine next steps |
| `datamodel-optimization` | quality | Optimize dataclasses and data structures |
All 17 agents: call `list_kaizen_agents()` for the full list.

@@ -0,0 +1,8 @@
## Architecture
<!-- TODO: Describe the key design decisions and component structure.
Key modules, data flows, external integrations, state machines, etc. -->
## Quick Reference
`~/state-hub/mcp_server/TOOLS.md` — MCP tool reference

@@ -0,0 +1,50 @@
# Credential and access routing
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
### Lookup (do this first)
```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=shard-wiki` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
### Quick routing table
| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
### Anti-patterns (do not do these)
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat
### Other capabilities (reuse-surface)
Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`

@@ -0,0 +1,38 @@
## First Session Protocol
Triggered when `get_domain_summary("consumer")` shows **no workstreams**.
The project is registered but work has not yet been structured.
**Step 1 — Read, don't write**
- `~/the-custodian/canon/projects/consumer/project_charter_v0.1.md` — purpose, scope
- `~/the-custodian/canon/projects/consumer/roadmap_v0.1.md` — planned phases
- Scan repo root: README, directory structure, existing code or docs
**Step 2 — Survey in-progress work**
Look for TODOs, open branches, half-finished files. Note done vs. started but incomplete.
**Step 3 — Propose workstreams to Bernd**
Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
roadmap phase. **Wait for approval before creating.**
**Step 4 — Create workplan file first, then DB record (ADR-001)**
```
workplans/SHARD-WP-NNNN-<slug>.md ← write this first
```
Then register in the hub:
```
create_workstream(topic_id="4c2e5315-2cb9-447c-9d16-a39bdb0aabd0", title="...", owner="...", description="...")
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
```
**Step 5 — Record the setup**
```
add_progress_event(
summary="First session: structured consumer into N workstreams, M tasks",
event_type="milestone",
topic_id="4c2e5315-2cb9-447c-9d16-a39bdb0aabd0",
detail={"workstreams": [...], "tasks_created": M}
)
```
<!-- Delete or archive this file once past first session -->

@@ -0,0 +1,8 @@
## Repo boundary
This repo owns **shard-wiki** only. It does not own:
<!-- TODO: List what belongs in adjacent repos, e.g.:
- SSH key management → railiance-infra/
- State hub code → state-hub/
-->

@@ -0,0 +1,5 @@
**Purpose:** Git-based Markdown wiki orchestrator and federation layer. Python (src/ layout, hatchling, pytest). Early-stage: scaffold + INTENT.md defined, domain model not yet implemented. See INTENT.md for authoritative scope.
**Domain:** consumer
**Repo slug:** shard-wiki
**Topic ID:** 4c2e5315-2cb9-447c-9d16-a39bdb0aabd0

@@ -0,0 +1,85 @@
## Session Protocol
Dev Hub (State Hub API): http://127.0.0.1:8000
MCP server name in `~/.claude.json`: `dev-hub`
**Step 1 — Orient**
Read the offline-safe brief first — it works without a live hub connection:
```bash
cat .custodian-brief.md
```
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
```
get_domain_summary("consumer")
```
If MCP tools are unavailable in the current agent session, use the REST API:
```bash
curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
```
If the hub is offline: `cd ~/state-hub && make api`
**Step 2 — Check inbox**
With MCP tools:
```
get_messages(to_agent="shard-wiki", unread_only=True)
```
Mark read with `mark_message_read(message_id)`. Reply or act on coordination
requests before proceeding.
Without MCP tools:
```bash
curl -s "http://127.0.0.1:8000/messages/?to_agent=shard-wiki&unread_only=true" \
| python3 -m json.tool
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
-H "Content-Type: application/json" -d '{}'
```
**Step 3 — Scan workplans**
```bash
ls workplans/
```
For each file with `status: ready`, `active`, or `blocked`, note pending
`wait`/`todo`/`progress` tasks.
**Step 4 — Present brief**
1. **Active workstreams** for `consumer` — title, task counts, blocking decisions
2. **Pending tasks** from `workplans/` + any `[repo:shard-wiki]` hub tasks
3. **Goal guidance** — if `goal_guidance` in summary:
- `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
- `alignment_warnings`: flag if active work is not aligned with current goal
4. **Suggested next action** — highest-priority open item
5. **SBOM status** — flag if `last_sbom_at` is unset for this repo
If no workstreams: follow First Session Protocol (`first-session.md`).
**During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`
> State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
> are First Session Protocol only. Work structure belongs in repo files (ADR-001).
**Session close:**
With MCP tools:
```
add_progress_event(summary="...", topic_id="4c2e5315-2cb9-447c-9d16-a39bdb0aabd0", workstream_id="<uuid>")
```
Without MCP tools:
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{"topic_id":"4c2e5315-2cb9-447c-9d16-a39bdb0aabd0","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
```
If workplan files were modified, ensure the local copy is up to date first:
```bash
git -C <repo_path> pull --ff-only
cd ~/state-hub && make fix-consistency REPO=shard-wiki
```
For repos where implementation runs on a remote machine (e.g. CoulombCore),
use the combined target which pulls before fixing:
```bash
cd ~/state-hub && make fix-consistency-remote REPO=shard-wiki
```
**C-15** (DB task ahead of file) is normal in multi-machine workflows — writeback
will sync the file to match DB. **C-16** (repo behind remote) blocks all writes
until you pull — intentional to prevent clobbering remote progress.
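For agents that prefer a script over `curl`, the REST fallbacks in this session protocol could be wrapped roughly as follows. This sketch only builds the requests; the function names are hypothetical, and the endpoints, query parameters, and payload fields are copied from the `curl` examples rather than from an API reference.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

HUB = "http://127.0.0.1:8000"  # Dev Hub address from the session protocol
JSON_HEADERS = {"Content-Type": "application/json"}


def inbox_request(agent: str, hub: str = HUB) -> Request:
    """GET unread messages addressed to `agent`."""
    query = urlencode({"to_agent": agent, "unread_only": "true"})
    return Request(f"{hub}/messages/?{query}", method="GET")


def mark_read_request(message_id: str, hub: str = HUB) -> Request:
    """PATCH a message as read; the body is an empty JSON object."""
    return Request(f"{hub}/messages/{message_id}/read", data=b"{}",
                   headers=JSON_HEADERS, method="PATCH")


def progress_request(topic_id: str, summary: str, author: str,
                     workstream_id: Optional[str] = None, hub: str = HUB) -> Request:
    """POST a progress note at session close."""
    payload = {"topic_id": topic_id, "event_type": "note",
               "summary": summary, "author": author}
    if workstream_id:
        payload["workstream_id"] = workstream_id
    return Request(f"{hub}/progress/", data=json.dumps(payload).encode("utf-8"),
                   headers=JSON_HEADERS, method="POST")
```

Each returned `Request` can be sent with `urllib.request.urlopen(request)` once the hub is running.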

@@ -0,0 +1,19 @@
## Stack
<!-- TODO: Fill in language, frameworks, and key dependencies -->
- **Language:**
- **Key deps:**
## Dev Commands
```bash
# TODO: Fill in the standard commands for this repo
# Install dependencies
# Run tests
# Lint / type check
# Build / package (if applicable)
```

@@ -0,0 +1,40 @@
## Workplan Convention (ADR-001)
File location: `workplans/SHARD-WP-NNNN-<slug>.md`
ID prefix: `SHARD-WP-`
Work items originate as files in this repo **before** being registered in the hub.
Canonical workplan/workstream frontmatter statuses are:
`proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
Use `proposed` for a newly drafted plan, `ready` after review against current
repo state, and `finished` when implementation is complete. `stalled` and
`needs_review` are derived health labels, not stored statuses.
Closed workplans may be moved to `workplans/archived/` with a completion-date
prefix: `YYMMDD-SHARD-WP-NNNN-<slug>.md`. The frontmatter id remains
unchanged; the prefix is only for quick visual reference.
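The archive naming rule can be expressed as a small helper. This is a sketch under assumptions: `archived_path` is a hypothetical name, `NNNN` is read as four digits, and the slug character set (lowercase letters, digits, hyphens) is a guess that the convention does not state.

```python
import re
from datetime import date

# Assumed shape of an active workplan filename: SHARD-WP-NNNN-<slug>.md
_ACTIVE = re.compile(r"^SHARD-WP-\d{4}-[a-z0-9-]+\.md$")


def archived_path(filename: str, completed: date) -> str:
    """Return the archived location with a YYMMDD completion-date prefix."""
    if not _ACTIVE.match(filename):
        raise ValueError(f"not a workplan filename: {filename!r}")
    # Only the filename gains the prefix; the frontmatter id is left untouched.
    return f"workplans/archived/{completed:%y%m%d}-{filename}"
```

For example, a hypothetical `SHARD-WP-0012-git-shard-adapter.md` completed on 2026-06-16 would move to `workplans/archived/260616-SHARD-WP-0012-git-shard-adapter.md`.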
Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
`workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
`ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
directly. Promote anything requiring analysis, design, approval, dependencies, or
multiple planned phases into a normal workplan.
Ecosystem todos from other agents arrive as `[repo:shard-wiki]` hub tasks —
visible at session start. Pick one up by creating the workplan file, then registering
the workstream.
Task blocks use this shape:
```task
id: SHARD-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
```
Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
blocked work and `cancel` for stopped work.
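A checker for the task block shape above might look like this. It is a minimal sketch, not the `fix-consistency` implementation: `parse_tasks` is a hypothetical name, fields are read as simple `key: value` lines instead of full YAML, and anything after a `#` is treated as a comment.

```python
import re
from typing import Dict, List

FENCE = "`" * 3  # built at runtime so this example stays a single fenced block
STATUSES = {"wait", "todo", "progress", "done", "cancel"}
PRIORITIES = {"high", "medium", "low"}
_BLOCK = re.compile(rf"^{FENCE}task\n(.*?)^{FENCE}\s*$", re.DOTALL | re.MULTILINE)


def parse_tasks(text: str) -> List[Dict[str, str]]:
    """Extract every task block from a workplan and validate status and priority."""
    tasks = []
    for match in _BLOCK.finditer(text):
        fields: Dict[str, str] = {}
        for line in match.group(1).splitlines():
            line = line.split("#", 1)[0].strip()  # drop trailing comments
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip().strip('"')
        if fields.get("status") not in STATUSES:
            raise ValueError(f"invalid status in task {fields.get('id')!r}")
        if fields.get("priority") not in PRIORITIES:
            raise ValueError(f"invalid priority in task {fields.get('id')!r}")
        tasks.append(fields)
    return tasks
```

Rejecting unknown literals at parse time is one way to catch legacy status values before they reach the hub.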
<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->

17
.repo-classification.yaml Normal file

@@ -0,0 +1,17 @@
repo_classification:
standard: Repo Classification Standard
version: '1.0'
classified_at: '2026-06-22'
classified_by: agent
category: project
domain: consumer
secondary_domains: []
capability_tags:
- knowledge
- documentation
business_stake:
- product
- experience
business_mechanics:
- coordination
- operation

243
AGENTS.md

@@ -1,62 +1,219 @@
# AGENTS.md
# shard-wiki — Agent Instructions
Guidance for agents working in `shard-wiki`.
## Repo Identity
## Read First
**Purpose:** Git-based Markdown wiki orchestrator and federation layer. Python (src/ layout, hatchling, pytest). Early-stage: scaffold + INTENT.md defined, domain model not yet implemented. See INTENT.md for authoritative scope.
1. `INTENT.md` — aspiration and boundaries (stable; architectural changes are rare).
2. `SCOPE.md` — what we are achieving now and current maturity.
3. `.custodian-brief.md` — State Hub snapshot (generated; do not edit manually).
**Domain:** consumer
**Repo slug:** shard-wiki
**Topic ID:** `4c2e5315-2cb9-447c-9d16-a39bdb0aabd0`
**Workplan prefix:** `SHARD-WP-`
## Documentation Layout
---
This repo follows the CoulombSocial / HelixForge / MarkiTect documentation
layout (recommendation, not strict law). Efficient retrieval by purpose:
## State Hub Integration
| Path | Purpose |
|------|---------|
| `INTENT.md` | Aspiration and boundaries |
| `SCOPE.md` | Top-level view of current achievement; closes gap to INTENT |
| `research/` | Exploration results (`yymmdd-` prefix on files or subdirs) |
| `demand/` | Inbound requests not yet reviewed into spec or workplans |
| `spec/` | Implementation guardrails (PRD, TSD, use cases, architecture) |
| `workplans/` | State Hub-registered implementation tasks |
| `docs/` | Stakeholder documentation (users, developers, humans, agents) |
| `wiki/` | Perspective-free interconnected knowledge (wiki UI when connected) |
| `issues/` | Mirror of relevant open tickets when ticket systems are in use |
| `history/` | Archived material (`yymmdd-` prefix); out of scope for daily work |
The Custodian State Hub tracks work across all domains. Interact via HTTP REST —
there is no MCP server for Codex agents.
**Mode of operation:** close SCOPE → INTENT while learning; refine both as needed.
| Context | URL |
|---------|-----|
| Local workstation | `http://127.0.0.1:8000` |
| Remote via tunnel | `http://127.0.0.1:18000` |
## Domain Vocabulary
Honor terms from `INTENT.md`: shard, root entity, adapter contract, projection,
overlay, coordination journal, shard modes. Do not invent parallel vocabulary.
## Build And Test
### Orient at session start
```bash
pip install -e ".[dev]"
pytest
ruff check
ruff format
# Offline brief — works without hub connection
cat .custodian-brief.md
# Active workstreams for this domain
curl -s "http://127.0.0.1:8000/workstreams/?topic_id=4c2e5315-2cb9-447c-9d16-a39bdb0aabd0&status=active" \
| python3 -m json.tool
# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=shard-wiki&unread_only=true" \
| python3 -m json.tool
```
## State Hub
Mark a message read:
```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
-H "Content-Type: application/json" -d '{}'
```
Workplans register with State Hub. After workplan changes:
### Log progress (required at session close)
```bash
cd ~/state-hub && make fix-consistency REPO=shard-wiki
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{
"summary": "what was done",
"event_type": "note",
"author": "codex",
"workstream_id": "<uuid>",
"task_id": "<uuid>"
}'
```
Finished or canceled workplans move to `history/` with a `yymmdd-` archive prefix.
Omit `workstream_id` / `task_id` when not applicable.
## Where To Put New Material
### Update task status
- Exploratory analysis → `research/yymmdd-<topic>/`
- Raw feature ask or external requirement → `demand/`
- Reviewed design ready to guide code → `spec/`
- Implementation tasks → `workplans/`
- User/dev/agent how-to → `docs/`
- Collaborative unstructured notes → `wiki/`
```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
-H "Content-Type: application/json" \
-d '{"status": "progress"}'
# values: wait | todo | progress | done | cancel
```
### Flag a task for human review
```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
-H "Content-Type: application/json" \
-d '{"needs_human": true, "intervention_note": "reason"}'
```
---
## Session Protocol
**Start:**
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
2. Check inbox: `GET /messages/?to_agent=shard-wiki&unread_only=true`; mark read
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
4. Check human-needed tasks: `GET /tasks/?needs_human=true`
**During work:**
- Update task statuses in workplan files as tasks progress
- Record significant decisions via `POST /decisions/`
**Close:**
1. Update workplan file task statuses to reflect progress
2. Log: `POST /progress/` with a summary of what changed
3. Note for the custodian operator: after workplan file changes, run from
`~/state-hub`:
```bash
make fix-consistency REPO=shard-wiki
```
This syncs task status from files into the hub DB.
---
## Credential and access routing
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
### Lookup (do this first)
```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=shard-wiki` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
### Quick routing table
| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
### Anti-patterns (do not do these)
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat
### Other capabilities (reuse-surface)
Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
<!-- REPO-AGENTS-EXTENSIONS -->
<!-- Append repo-specific agent instructions below this marker.
The state-hub template sync preserves content after this line. -->
---
## Workplan Convention (ADR-001)
Work items originate as files in this repo — not in the hub. The hub is a
read/cache/index layer that rebuilds from files.
**File location:** `workplans/SHARD-WP-NNNN-<slug>.md`
**Archived location:** finished workplans may move to
`workplans/archived/YYMMDD-SHARD-WP-NNNN-<slug>.md`. The `YYMMDD` prefix is
the completion/archive date; the frontmatter `id` does not change.
**Ad Hoc Tasks:** small opportunistic fixes discovered during a session use
`workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use
this only for low-risk work completed directly; create a normal workplan for
anything needing analysis, design, approval, dependencies, or multiple phases.
**Frontmatter:**
```yaml
---
id: SHARD-WP-NNNN
type: workplan
title: "..."
domain: consumer
repo: shard-wiki
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex
topic_slug: ...
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: "<uuid>" # written by fix-consistency — do not edit
---
```
Use `proposed` for a new draft, `ready` after review against current repo
state, and `finished` after implementation. `stalled` and `needs_review` are
derived health labels, not frontmatter statuses.
**Task block format** (one per `##` section):
```
## Task Title
` ` `task
id: SHARD-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
` ` `
Task description text.
```
Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.
To create a new workplan:
1. Write the file following the format above
2. Notify the custodian operator to run `make fix-consistency REPO=shard-wiki`
(or send a message to the hub agent via `POST /messages/`)

@@ -1,53 +1,12 @@
# CLAUDE.md
# shard-wiki — Claude Code Instructions
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Repository status
This is an **early-stage Python repository**. The package scaffold (`src/shard_wiki/`, `tests/`, `pyproject.toml`) exists with only smoke tests — the domain model is not yet implemented. Read `INTENT.md` (aspiration), `SCOPE.md` (current achievement), and `AGENTS.md` (layout and conventions) before designing anything. Close the gap from SCOPE to INTENT via `research/`, `spec/`, and `workplans/`.
## What this project is
`shard-wiki` is a **Git-based Markdown wiki orchestrator and federation layer**, not a wiki engine. It lets multiple heterogeneous wiki-shaped page stores (**shards**) attach to a shared root entity and be presented as a **union of pages**, while preserving each shard's separate storage, provenance, capabilities, and history.
The core job is orchestration across backends — Git repos, repo subdirectories (`wiki/`), Gitea wikis, local folders, Obsidian vaults, WebDAV/Nextcloud directories, Coulomb spaces — never replacing or homogenizing them.
## Core domain model (the concepts code must honor)
These abstractions come from `INTENT.md` and define the architecture. New code should map onto them rather than inventing parallel vocabulary:
- **Shard** — an independently meaningful page store attached to a root entity. Shards have *sovereignty*: their own backend, capabilities, limits, history, and identity model. Not all shards are Git-native.
- **Root entity / information space** — the joined space that shards attach to. Each information space should have a **Git-addressable coordination layer** (history, patches, review, backup, reconciliation) even when individual shards are not Git-native.
- **Shard adapter contract** — the versioned interface a backend implements to participate. Adapters are **capability-aware**: the core must model explicitly which operations a shard supports (read, write, diff, merge, lock, version, publish, accept patches) rather than assuming uniformity.
- **Wiki page model** — a stable, versioned, Markdown-first but backend-neutral representation of pages, paths, links, metadata, revisions.
- **Projection** — a lazy, cache-like local view of remote/external shard content. Prefer lazy projection over eager copying.
- **Overlay** — a non-destructive local edit against a remote, read-only, or capability-limited shard, representable as drafts/patches/commits/merge requests *before* destructive application ("overlay before mutation").
- **Coordination journal** — the Git-backed record of change flows for an information space.
- **Shard modes** — read-only, write-through, mirrored, projected, cached, canonical.
## Design constraints to enforce in code
These are hard boundaries from `INTENT.md`; treat violations as design bugs:
- **Mechanism over policy.** Provide primitives for federation, sync, overlays, patching, conflict detection, projection, reconciliation. Do *not* hard-code one editorial/sync/conflict/canonical-source policy — keep those configurable.
- **Union without erasure.** Always preserve provenance: which shard a page came from, its freshness, whether it is cached, whether it has overlays, whether it diverges from an equivalent page elsewhere. Never hide authorship, conflicts, freshness, or backend limitations.
- **No silent remote mutation.** Do not mutate remote systems without explicit adapter support and user intent.
- **Graceful degradation.** Limited backends must still be usable as read-only/cache/projection/backup/patch targets.
- **Not a file-sync daemon.** Synchronization is wiki-page-semantic, not generic file mirroring.
`INTENT.md` has a "Stability Note": changes that redefine what a shard is, Git's role, how root entities are modeled, or whether this is an orchestrator vs. an engine are **architectural changes** and should be rare and deliberate.
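As an illustration of "no silent remote mutation" combined with overlay-before-mutation, the sketch below applies a draft only with explicit intent and only if the target has not moved since the draft was written. All names here (`Draft`, `InMemoryShard`, `apply_draft`, `DriftError`) are assumptions made for the example, not the project's overlay engine.

```python
from __future__ import annotations

from dataclasses import dataclass


class DriftError(Exception):
    """The target changed after the draft was written; refuse to clobber it."""


@dataclass(frozen=True)
class Draft:
    key: str
    body: str
    base_rev: str | None  # revision the edit was written against


class InMemoryShard:
    """Hypothetical stand-in for a writable shard with per-page revisions."""

    def __init__(self) -> None:
        self._bodies: dict[str, str] = {}
        self._revs: dict[str, int] = {}

    def current_rev(self, key: str) -> str | None:
        rev = self._revs.get(key)
        return None if rev is None else str(rev)

    def read(self, key: str) -> str:
        return self._bodies[key]

    def write(self, key: str, body: str) -> str:
        self._bodies[key] = body
        self._revs[key] = self._revs.get(key, 0) + 1
        return str(self._revs[key])


def apply_draft(shard: InMemoryShard, draft: Draft, *, confirmed: bool) -> str:
    """Apply a draft only with explicit intent and only if the base revision still holds."""
    if not confirmed:
        raise PermissionError("explicit user intent is required before mutating a shard")
    if shard.current_rev(draft.key) != draft.base_rev:
        raise DriftError(f"{draft.key} moved since the draft was written")
    return shard.write(draft.key, draft.body)
```

A refused draft stays a draft: nothing is lost, and the conflict is visible instead of being resolved by a hard-coded policy.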
## Build, test, run
Python with a `src/` layout, built via hatchling, tested with pytest. Tests run against the source tree directly (`pythonpath = ["src"]` in `pyproject.toml`), so no install/editable step is required to run them.
```bash
pip install -e ".[dev]" # one-time: install dev tooling (pytest, pytest-cov, ruff)
pytest # run the full test suite
pytest tests/test_package.py::test_version_is_exposed # run a single test
pytest --cov # run with coverage
ruff check # lint
ruff format # format
```
Note: the system `pytest` is 7.4.x; `minversion` in `pyproject.toml` is pinned to `7.0` to match. Bump it if a newer pytest is installed into the dev environment.
@SCOPE.md
@.claude/rules/repo-identity.md
@.claude/rules/session-protocol.md
@.claude/rules/first-session.md
@.claude/rules/workplan-convention.md
@.claude/rules/stack-and-commands.md
@.claude/rules/architecture.md
@.claude/rules/repo-boundary.md
@.claude/rules/credential-routing.md
@.claude/rules/agents.md


@@ -17,7 +17,7 @@ Learnings update both SCOPE and INTENT where necessary.
| Layer | State |
|-------|-------|
| Code | Foundation slice implemented (SHARD-WP-0007): `provenance` + `policy` leaves, `model` (Identity/Placement/Span/Page/CapabilityProfile), `adapters` (contract + FolderAdapter + conformance suite), `coordination` (event-sourced DecisionLog), `union` (resolution + chorus, overlay-aware), `InformationSpace` orchestrator. Write path added (SHARD-WP-0008): writable adapter, overlay engine (draft→patch→apply-under-drift), edit() unifies write-through + overlay-before-mutation. Native engine implemented (SHARD-WP-0014): `engine` (kernel + typed-extension runtime + per-shard activation [ADR-0001] + capability-profile-from-extensions + EngineShardAdapter + the `ext.struct` built-in) — an engine shard attaches to an InformationSpace as a canonical-mode shard. Git-backed coordination log (SHARD-WP-0009): `DecisionLog` storage factored behind an `EventStore`; `GitEventStore` makes the log git-addressable (each space a ref, append = immutable CAS-guarded commit), a per-space `AppendAuthority` (lease) gives a single-writer total order with re-grantable HA hand-off, cross-process read-your-writes verified, and a verbatim one-time importer (`migrate_space`/JSONL) replays in-memory logs into git; `InformationSpace.git_backed(...)` wires it. 128 tests green, ~97% coverage |
| Code | Foundation slice implemented (SHARD-WP-0007): `provenance` + `policy` leaves, `model` (Identity/Placement/Span/Page/CapabilityProfile), `adapters` (contract + FolderAdapter + conformance suite), `coordination` (event-sourced DecisionLog), `union` (resolution + chorus, overlay-aware), `InformationSpace` orchestrator. Write path added (SHARD-WP-0008): writable adapter, overlay engine (draft→patch→apply-under-drift), edit() unifies write-through + overlay-before-mutation. Native engine implemented (SHARD-WP-0014): `engine` (kernel + typed-extension runtime + per-shard activation [ADR-0001] + capability-profile-from-extensions + EngineShardAdapter + the `ext.struct` built-in) — an engine shard attaches to an InformationSpace as a canonical-mode shard. Git-backed coordination log (SHARD-WP-0009): `DecisionLog` storage factored behind an `EventStore`; `GitEventStore` makes the log git-addressable (each space a ref, append = immutable CAS-guarded commit), a per-space `AppendAuthority` (lease) gives a single-writer total order with re-grantable HA hand-off, cross-process read-your-writes verified, and a verbatim one-time importer (`migrate_space`/JSONL) replays in-memory logs into git; `InformationSpace.git_backed(...)` wires it. Derived views (SHARD-WP-0010): `views` (wikilink + red-link model, BackLinks, RecentChanges, AllPages/SiteMap) — recomputable, provenance-carrying, presentation-free, exposed via `InformationSpace.backlinks/recent_changes/all_pages/site_map`. Incremental-first derived tier (SHARD-WP-0011): `incremental` (indexed equivalence via MinHash/LSH blocking + verify, change-driven delta maintenance with retraction/propagation, Merkle-style digest + self-healing I-2 consistency-checker, `UnionIndex` routed behind `InformationSpace.all_pages` with rebuild as explicit fallback). Second adapter (SHARD-WP-0012): `GitShardAdapter` — git-IS-store substrate (read=tracked *.md, write=commit, current_rev=per-path sha for drift, adopted git-native history), passes conformance, works across folder+git shards in union/overlay/edit with no core change (capability-as-data proven on a second substrate). 196 tests green, ~97% coverage |
| Intent | `INTENT.md` established; authorization-in-core amendments drafted |
| Research | yawex prior art; c2 origins; federation concepts; wikiengines overview (`research/260608-*/`); XWiki/TWiki/Foswiki deep dives (`research/260613-*/`); Xanadu + ZigZag + Roam + Obsidian + Notion + Joplin + Logseq + local-first workspaces (Anytype/AFFiNE/AppFlowy) + Trilium + Wiki.js + Federated Wiki + Wikibase + git-forge wikis + TiddlyWiki + ikiwiki + Quip + MojoMojo + Oddmuse + UseModWiki deep dives & shard-spectrum synthesis (`research/260614-*/`) |
| Demand | NetKingdom integration asks captured, not yet negotiated |


@@ -9,10 +9,13 @@ from shard_wiki.adapters.conformance import (
)
from shard_wiki.adapters.contract import CONTRACT_VERSION, ShardAdapter
from shard_wiki.adapters.folder import FolderAdapter
from shard_wiki.adapters.git import GitShardAdapter, PageRevision

__all__ = [
    "ShardAdapter",
    "FolderAdapter",
    "GitShardAdapter",
    "PageRevision",
    "CONTRACT_VERSION",
    "Check",
    "ConformanceReport",


@@ -0,0 +1,180 @@
"""GitShardAdapter — a second substrate: git-as-store (SHARD-WP-0012; TSD §A.3 git-IS-store).
The home case where **git is the store *and* the journal**. Tracked ``*.md`` paths are the page
keys; the working-tree file is the body; a page's ``source_rev`` is the **commit sha of the last
commit touching its path** (per-path, so an edit to one page never drifts another). The declared
profile is *git-IS-store ⟹ substrate=git ∧ history=git-native* — the implication rule the
capability model enforces (§6.5), validated at registration like any other binding.
This adapter adds **no core changes**: it implements the same :class:`ShardAdapter` contract the
folder adapter does, proving "write an adapter + declare a verified profile" is the whole cost of a
new substrate (capability-as-data, I-3). Built on the ``git`` CLI via subprocess — zero new deps.
"""
from __future__ import annotations

import os
import subprocess
from collections.abc import Iterable
from dataclasses import dataclass
from pathlib import Path

from shard_wiki.adapters.contract import ShardAdapter
from shard_wiki.model import (
    AccessGrant,
    Addressing,
    AttachmentMode,
    CapabilityProfile,
    ContentOpacity,
    History,
    Identity,
    MergeModel,
    NativeQuery,
    NotSupported,
    OperationalEnvelope,
    Page,
    Placement,
    Substrate,
    Translation,
    Verb,
    WriteGranularity,
)
from shard_wiki.provenance import Liveness, ProvenanceEnvelope, Staleness

__all__ = ["GitShardAdapter", "PageRevision"]


@dataclass(frozen=True, slots=True)
class PageRevision:
    """One adopted git-native revision of a page: the commit sha and its subject line."""

    sha: str
    message: str


_GIT_IDENTITY = {
    "GIT_AUTHOR_NAME": "shard-wiki",
    "GIT_AUTHOR_EMAIL": "shard@shard-wiki",
    "GIT_COMMITTER_NAME": "shard-wiki",
    "GIT_COMMITTER_EMAIL": "shard@shard-wiki",
}


class GitShardAdapter(ShardAdapter):
    """A shard whose store is a git repo: keys are tracked ``*.md`` paths, revs are commit shas."""

    def __init__(self, shard_id: str, repo_path: str | Path, writable: bool = False) -> None:
        self._shard_id = shard_id
        self._repo = Path(repo_path)
        self._writable = writable
        self._repo.mkdir(parents=True, exist_ok=True)
        if not (self._repo / ".git").exists():
            self._git("init", "--quiet")

    @property
    def shard_id(self) -> str:
        return self._shard_id

    def profile(self) -> CapabilityProfile:
        # VERSION is always available — a git-IS-store has git-native history to adopt (§A.5),
        # read-only or not. WRITE (= commit, PER_PAGE) is added only in writable mode.
        verbs = {Verb.READ, Verb.VERSION}
        granularity = WriteGranularity.NONE
        if self._writable:
            verbs |= {Verb.WRITE}
            granularity = WriteGranularity.PER_PAGE
        return CapabilityProfile(
            substrate=Substrate.GIT,
            attachment_mode=AttachmentMode.GIT_IS_STORE,
            write_granularity=granularity,
            content_opacity=ContentOpacity.TRANSPARENT,
            operational_envelope=OperationalEnvelope.LOCAL_UNBOUNDED,
            access_grant=AccessGrant.OPEN,
            liveness=Liveness.STATIC,
            history=History.GIT_NATIVE,  # git-is-store ⟹ git-native (§6.5)
            merge_model=MergeModel.GIT_TEXT,
            addressing=Addressing.PATH,
            native_query=NativeQuery.NONE,
            translation=Translation.NATIVE,
            supported_verbs=frozenset(verbs),
        ).validate()

    def write(self, key: str, body: str) -> Page:
        """Write = **commit**: stage the file and commit it (skip a no-op so no empty commit),
        returning the page at the new sha. Drift detection rides on ``current_rev`` = that sha."""
        if not self._writable:
            raise NotSupported(f"{type(self).__name__} is read-only")
        rel = f"{key}.md"
        path = self._path_for(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(body, encoding="utf-8")
        self._git("add", "--", rel)
        if self._run("diff", "--cached", "--quiet").returncode != 0:  # staged changes present
            self._git("commit", "-m", f"write {rel}", env=_GIT_IDENTITY)
        return self.read(key)

    def keys(self) -> Iterable[str]:
        out = self._git("ls-files", "*.md").decode()
        for line in out.splitlines():
            yield line[: -len(".md")] if line.endswith(".md") else line

    def read(self, key: str) -> Page:
        path = self._path_for(key)
        if not path.is_file():
            raise KeyError(key)
        rev = self.current_rev(key)
        return Page(
            identity=Identity(self._shard_id, key),
            body=path.read_text(encoding="utf-8"),
            envelope=ProvenanceEnvelope(
                source_shard=self._shard_id,
                liveness=Liveness.STATIC,
                staleness=Staleness.FRESH,
                source_rev=rev,
                lineage="git-native",
            ),
            placements=(Placement(self._shard_id, f"{key}.md"),),
        )

    def current_rev(self, key: str) -> str | None:
        """The sha of the last commit touching ``key``'s path (per-path drift token), or None."""
        rel = f"{key}.md"
        if not self._path_for(key).is_file():
            return None
        sha = self._git("log", "-1", "--format=%H", "--", rel).decode().strip()
        return sha or None

    def history(self, key: str) -> tuple[PageRevision, ...]:
        """Adopt git-native history (§A.5): the commit list for ``key``'s path, newest-first.

        VERSION-gated; raises ``KeyError`` for an unknown page. Each revision is a commit sha +
        subject — the native log surfaced through the contract, not re-implemented.
        """
        if not self.profile().supports(Verb.VERSION):
            raise NotSupported(f"{type(self).__name__} does not support version")
        if not self._path_for(key).is_file():
            raise KeyError(key)
        out = self._git("log", "--format=%H%x00%s", "--", f"{key}.md").decode()
        revisions = []
        for line in out.splitlines():
            sha, _, message = line.partition("\x00")
            revisions.append(PageRevision(sha=sha, message=message))
        return tuple(revisions)

    # -- git plumbing --------------------------------------------------------

    def _path_for(self, key: str) -> Path:
        return self._repo / f"{key}.md"

    def _git(self, *args: str, stdin: bytes | None = None, env: dict | None = None) -> bytes:
        return self._run(*args, stdin=stdin, env=env, check=True).stdout

    def _run(
        self, *args: str, stdin: bytes | None = None, env: dict | None = None, check: bool = False
    ) -> subprocess.CompletedProcess:
        return subprocess.run(
            ["git", "-C", str(self._repo), *args],
            input=stdin,
            capture_output=True,
            env={**os.environ, **(env or {})},
            check=check,
        )


@@ -0,0 +1,46 @@
"""incremental/ — the incremental-first derived tier (CoreArchitectureBlueprint §8.7).
Equivalence is **indexed** (blocking/LSH + verify), not pairwise O(N²); maintenance is
**change-driven** (delta with retraction + propagation, review B-4), keeping the derived tier equal
to a from-scratch rebuild — which becomes a bounded fallback, not the operational path. A
Merkle-style **digest** plus a background **consistency-checker** make ``derived = f(canonical)``
verified rather than asserted (I-2), self-healing on detected drift.
In-memory only for this slice (no persisted index store); per-partition structure is honoured but
multi-tenant deployment is later. Per the dependency rule this imports down (model/provenance) and
is wired by the orchestrator.
"""

from shard_wiki.incremental.equivalence import (
    EquivalenceEdge,
    EquivalenceIndex,
    normalized_title,
)
from shard_wiki.incremental.minhash import (
    MinHasher,
    band_keys,
    jaccard,
    shingles,
)
from shard_wiki.incremental.union_index import UnionIndex
from shard_wiki.incremental.verification import (
    ConsistencyChecker,
    ConsistencyReport,
    derived_digest,
    region_digest,
)

__all__ = [
    "shingles",
    "MinHasher",
    "band_keys",
    "jaccard",
    "EquivalenceEdge",
    "EquivalenceIndex",
    "normalized_title",
    "derived_digest",
    "region_digest",
    "ConsistencyReport",
    "ConsistencyChecker",
    "UnionIndex",
]


@@ -0,0 +1,225 @@
"""Indexed equivalence — blocking + verify, incrementally maintained (SHARD-WP-0011 T1/T2).
Equivalence (two *distinct* identities holding the same page) is detected without pairwise O(N²):
1. **Blocking** generates candidate pairs — pages sharing a normalized-title bucket or an LSH band
(MinHash over content shingles).
2. **Verify** confirms a candidate — exact-body fingerprint match, or shingle Jaccard ≥ threshold —
plus **curator bindings** (explicit decision-log edges) which are always equivalence edges.
The index is **incrementally maintained** (T2): ``add`` / ``update`` / ``remove`` re-bucket the
changed page, **retract** the edges it leaves and **add** the edges it enters; equivalence groups
are the connected components of the current edge set, so a retraction that disconnects a component
**splits** a chorus automatically. A full :meth:`build` is just repeated ``add`` — the bounded
rebuild fallback. The invariant (and the test oracle): incremental state == a from-scratch rebuild.
"""
from __future__ import annotations

import hashlib
import re
from collections.abc import Iterable
from dataclasses import dataclass

from shard_wiki.incremental.minhash import MinHasher, band_keys, jaccard, shingles
from shard_wiki.model import Identity, Page

__all__ = ["EquivalenceEdge", "EquivalenceIndex", "normalized_title"]

_NONALNUM_RE = re.compile(r"[^a-z0-9]+")


def normalized_title(key: str) -> str:
    """A blocking bucket key: the last path segment, lowercased, stripped of non-alphanumerics."""
    leaf = key.rsplit("/", 1)[-1]
    return _NONALNUM_RE.sub("", leaf.lower())


@dataclass(frozen=True, slots=True)
class EquivalenceEdge:
    """A verified equivalence between two identities, tagged with why it was accepted."""

    a: Identity
    b: Identity
    reason: str  # "fingerprint" | "content" | "curator"


@dataclass(frozen=True, slots=True)
class _Entry:
    shingle_set: frozenset[str]
    bands: tuple[tuple[int, tuple[int, ...]], ...]
    title: str
    fingerprint: str


def _fingerprint(body: str) -> str:
    return hashlib.blake2b(body.strip().encode("utf-8"), digest_size=16).hexdigest()


def _pair(a: Identity, b: Identity) -> frozenset[Identity]:
    return frozenset((a, b))


class EquivalenceIndex:
    """An incrementally maintained, blocked-and-verified equivalence relation over union pages."""

    def __init__(
        self,
        *,
        num_perm: int = 64,
        num_bands: int = 32,
        threshold: float = 0.7,
        hasher: MinHasher | None = None,
    ) -> None:
        self.threshold = threshold
        self.num_bands = num_bands
        self._hasher = hasher or MinHasher(num_perm=num_perm)
        self._entries: dict[Identity, _Entry] = {}
        self._band_buckets: dict[tuple[int, tuple[int, ...]], set[Identity]] = {}
        self._title_buckets: dict[str, set[Identity]] = {}
        self._content_edges: dict[frozenset[Identity], str] = {}
        self._curator_edges: set[frozenset[Identity]] = set()

    # -- build / maintain ----------------------------------------------------

    def build(
        self,
        pages: Iterable[Page],
        curator_edges: Iterable[tuple[Identity, Identity]] = (),
    ) -> None:
        """Rebuild from scratch (the bounded fallback): add every page, then curator edges."""
        self.__init__(
            num_bands=self.num_bands, threshold=self.threshold, hasher=self._hasher
        )
        for page in pages:
            self.add(page)
        for a, b in curator_edges:
            self.bind(a, b)

    def add(self, page: Page) -> None:
        """Index a new (or, via :meth:`update`, refreshed) page and add its equivalence edges."""
        identity = page.identity
        entry = self._make_entry(page)
        self._entries[identity] = entry
        for key in entry.bands:
            self._band_buckets.setdefault(key, set()).add(identity)
        self._title_buckets.setdefault(entry.title, set()).add(identity)
        for candidate in self._candidates(identity, entry):
            reason = self._verify(identity, candidate)
            if reason is not None:
                self._content_edges[_pair(identity, candidate)] = reason

    def remove(self, identity: Identity) -> None:
        """Drop a page: de-bucket it and retract every content edge incident to it."""
        entry = self._entries.pop(identity, None)
        if entry is None:
            return
        for key in entry.bands:
            self._discard_bucket(self._band_buckets, key, identity)
        self._discard_bucket(self._title_buckets, entry.title, identity)
        for edge in [e for e in self._content_edges if identity in e]:
            del self._content_edges[edge]

    def update(self, page: Page) -> None:
        """Apply a change as retract-then-add: stale (bucket-exit) edges go, new edges arrive."""
        self.remove(page.identity)
        self.add(page)

    def bind(self, a: Identity, b: Identity) -> None:
        """Record a curator equivalence (an explicit decision-log binding); always an edge."""
        if a != b:
            self._curator_edges.add(_pair(a, b))

    def unbind(self, a: Identity, b: Identity) -> None:
        self._curator_edges.discard(_pair(a, b))

    def set_curator_edges(self, edges: Iterable[tuple[Identity, Identity]]) -> None:
        """Replace all curator edges at once (re-syncing from the decision-log fold)."""
        self._curator_edges = {_pair(a, b) for a, b in edges if a != b}

    # -- queries -------------------------------------------------------------

    def identities(self) -> frozenset[Identity]:
        """All identities currently present in the index."""
        return frozenset(self._entries)

    def fingerprint(self, identity: Identity) -> str | None:
        """The content fingerprint indexed for ``identity`` (None if absent) — a digest leaf."""
        entry = self._entries.get(identity)
        return entry.fingerprint if entry is not None else None

    def edges(self) -> frozenset[frozenset[Identity]]:
        """All equivalence edges (content + curator) among currently present identities."""
        present = self._entries.keys()
        curator = {e for e in self._curator_edges if e <= present}
        return frozenset(set(self._content_edges) | curator)

    def groups(self) -> tuple[frozenset[Identity], ...]:
        """Equivalence groups: connected components of size ≥ 2 (union-find over the edges)."""
        parent: dict[Identity, Identity] = {}

        def find(x: Identity) -> Identity:
            parent.setdefault(x, x)
            root = x
            while parent[root] != root:
                root = parent[root]
            while parent[x] != root:
                parent[x], x = root, parent[x]
            return root

        for edge in self.edges():
            a, b = tuple(edge)
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
        comps: dict[Identity, set[Identity]] = {}
        for node in parent:
            comps.setdefault(find(node), set()).add(node)
        return tuple(
            frozenset(members) for members in comps.values() if len(members) > 1
        )

    def equivalent_to(self, identity: Identity) -> frozenset[Identity]:
        """The equivalence group containing ``identity`` (including itself), else just itself."""
        for group in self.groups():
            if identity in group:
                return group
        return frozenset({identity})

    # -- internals -----------------------------------------------------------

    def _make_entry(self, page: Page) -> _Entry:
        shingle_set = shingles(page.body)
        signature = self._hasher.signature(shingle_set)
        return _Entry(
            shingle_set=shingle_set,
            bands=band_keys(signature, self.num_bands),
            title=normalized_title(page.identity.key),
            fingerprint=_fingerprint(page.body),
        )

    def _candidates(self, identity: Identity, entry: _Entry) -> set[Identity]:
        candidates: set[Identity] = set()
        for key in entry.bands:
            candidates |= self._band_buckets.get(key, set())
        candidates |= self._title_buckets.get(entry.title, set())
        candidates.discard(identity)
        return candidates

    def _verify(self, a: Identity, b: Identity) -> str | None:
        ea, eb = self._entries[a], self._entries[b]
        if ea.fingerprint == eb.fingerprint:
            return "fingerprint"
        if jaccard(ea.shingle_set, eb.shingle_set) >= self.threshold:
            return "content"
        return None

    @staticmethod
    def _discard_bucket(buckets: dict, key, identity: Identity) -> None:
        bucket = buckets.get(key)
        if bucket is not None:
            bucket.discard(identity)
            if not bucket:
                del buckets[key]
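
The claim that retracting an edge splits a chorus automatically follows from deriving groups as connected components of the current edge set. The standalone sketch below shows that behaviour with plain strings in place of `Identity`; the helper name `components` is hypothetical and the code is an illustration, not the module's implementation.

```python
def components(edges):
    """Connected components of size >= 2, computed with a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return {frozenset(members) for members in groups.values() if len(members) > 1}


# Three edges chain four pages into one group.
edges = {("a", "b"), ("b", "c"), ("c", "d")}
one_group = components(edges)

# Retracting the middle edge disconnects the chain into two groups.
two_groups = components(edges - {("b", "c")})
```

Because groups are recomputed from whatever edges currently exist, no separate "split" operation is needed when an edge is retracted.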


@@ -0,0 +1,71 @@
"""MinHash + LSH banding primitives for content-similarity blocking (SHARD-WP-0011 T1).
Pure, deterministic functions (fixed hashing, no per-run randomness) so the derived tier and its
digest are reproducible. Shingle a body into k-grams, MinHash the shingle set into a signature,
split the signature into LSH bands; two pages sharing a band are *candidates* for equivalence —
the cheap pre-filter that replaces pairwise O(N²) comparison.
"""
from __future__ import annotations

import hashlib
import random
import re
from collections.abc import Iterable

__all__ = ["shingles", "MinHasher", "band_keys", "jaccard"]

_WORD_RE = re.compile(r"\w+")
# Largest Mersenne prime below 2**61 — the modulus for the universal-hash permutations.
_PRIME = (1 << 61) - 1


def shingles(text: str, k: int = 3) -> frozenset[str]:
    """The set of word k-grams in ``text`` (lowercased). Short texts fall back to their word set."""
    words = _WORD_RE.findall(text.lower())
    if len(words) < k:
        return frozenset(words)
    return frozenset(" ".join(words[i : i + k]) for i in range(len(words) - k + 1))


def _stable_hash(token: str) -> int:
    return int.from_bytes(hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest(), "big")


class MinHasher:
    """A bank of ``num_perm`` universal hash permutations producing a fixed-length signature."""

    def __init__(self, num_perm: int = 64, seed: int = 1) -> None:
        self.num_perm = num_perm
        rng = random.Random(seed)
        self._coeffs = [
            (rng.randrange(1, _PRIME), rng.randrange(0, _PRIME)) for _ in range(num_perm)
        ]

    def signature(self, shingle_set: Iterable[str]) -> tuple[int, ...]:
        """The MinHash signature of ``shingle_set`` (empty set → all-``_PRIME`` sentinel)."""
        hashed = [_stable_hash(s) for s in shingle_set]
        if not hashed:
            return tuple(_PRIME for _ in self._coeffs)
        return tuple(min((a * h + b) % _PRIME for h in hashed) for a, b in self._coeffs)


def band_keys(
    signature: tuple[int, ...], num_bands: int
) -> tuple[tuple[int, tuple[int, ...]], ...]:
    """Split a signature into ``num_bands`` band keys; two pages sharing one are LSH candidates."""
    if num_bands <= 0 or len(signature) % num_bands != 0:
        raise ValueError(f"signature length {len(signature)} not divisible into {num_bands} bands")
    rows = len(signature) // num_bands
    return tuple(
        (b, signature[b * rows : (b + 1) * rows]) for b in range(num_bands)
    )


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard similarity of two shingle sets; two empty sets are defined as identical (1.0)."""
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
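
To make the blocking idea concrete, here is a standalone sketch that compares an exact Jaccard score with a MinHash estimate for two near-duplicate bodies. It re-implements the primitives in simplified form so it runs without the package; the helper names (`signature`, `estimate`, `share_band`) and the sample sentences are assumptions for illustration only.

```python
import hashlib
import random

PRIME = (1 << 61) - 1  # same modulus as the module above


def shingles(text, k=3):
    words = text.lower().split()
    if len(words) < k:
        return frozenset(words)
    return frozenset(" ".join(words[i : i + k]) for i in range(len(words) - k + 1))


def stable_hash(token):
    return int.from_bytes(hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest(), "big")


def signature(shingle_set, num_perm=64, seed=1):
    # A fixed seed yields the same coefficients on every call, so signatures are comparable.
    rng = random.Random(seed)
    coeffs = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(num_perm)]
    hashed = [stable_hash(s) for s in shingle_set]
    if not hashed:
        return tuple(PRIME for _ in coeffs)
    return tuple(min((a * h + b) % PRIME for h in hashed) for a, b in coeffs)


def estimate(sig_a, sig_b):
    """Fraction of agreeing signature slots, an estimator of Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def share_band(sig_a, sig_b, num_bands=32):
    """True if any band (a contiguous slice of the signature) matches exactly."""
    rows = len(sig_a) // num_bands
    return any(
        sig_a[b * rows : (b + 1) * rows] == sig_b[b * rows : (b + 1) * rows]
        for b in range(num_bands)
    )


page_a = shingles("the quick brown fox jumps over the lazy dog")
page_b = shingles("the quick brown fox jumps over the lazy cat")
exact = len(page_a & page_b) / len(page_a | page_b)  # 6 shared of 8 trigrams = 0.75
approx = estimate(signature(page_a), signature(page_b))
```

The estimate only selects candidates; as in the index above, a candidate pair is still confirmed by an exact check before it becomes an edge.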


@@ -0,0 +1,91 @@
"""UnionIndex — the maintained derived tier wired behind resolution + views (SHARD-WP-0011 T4).
Wraps a :class:`UnionGraph` + decision log with an incrementally maintained
:class:`EquivalenceIndex`. Content equivalence is kept fresh by deltas (``note_change`` /
``note_removed``); curator bindings are re-synced live from the log fold. A full :meth:`rebuild`
is the bounded fallback. :meth:`verify` runs the I-2 consistency-checker over the live source.
Consumer-visible results are unchanged — equivalence groups are exposed in the same string form the
decision-log fold uses, a *superset* that additionally collapses genuine content duplicates — only
freshness and cost differ (recompute-on-read becomes change-driven).
"""
from __future__ import annotations

from shard_wiki.coordination import DecisionLog
from shard_wiki.incremental.equivalence import EquivalenceIndex
from shard_wiki.incremental.verification import (
    ConsistencyChecker,
    ConsistencyReport,
    derived_digest,
)
from shard_wiki.model import Identity, Page
from shard_wiki.union import UnionGraph

__all__ = ["UnionIndex"]


def _identity(token: str) -> Identity:
    shard, _, key = token.partition(":")
    return Identity(shard, key)


class UnionIndex:
    """An incrementally maintained equivalence index over a union, with a rebuild fallback."""

    def __init__(self, union: UnionGraph, log: DecisionLog, space: str) -> None:
        self._union = union
        self._log = log
        self._space = space
        self._eq = EquivalenceIndex()
        self.rebuild()

    def rebuild(self) -> None:
        """The bounded fallback: re-derive the whole index from current union pages + bindings."""
        self._eq.build(self._union.iter_pages())
        self._sync_curator()

    def note_change(self, page: Page) -> None:
        """Change-driven update for one added/edited page (the operational path)."""
        self._eq.update(page)

    def note_removed(self, identity: Identity) -> None:
        self._eq.remove(identity)

    def _sync_curator(self) -> None:
        """Re-sync curator equivalence from the live decision-log fold (cheap, always correct)."""
        groups = self._log.fold(self._space).equivalence_groups
        edges: list[tuple[Identity, Identity]] = []
        for group in groups:
            members = [_identity(m) for m in group]
            edges.extend((members[0], other) for other in members[1:])
        self._eq.set_curator_edges(edges)

    def equivalence_groups(self) -> tuple[frozenset[str], ...]:
        """Equivalence groups in decision-log string form (curator ∪ content), for the views."""
        self._sync_curator()
        return tuple(
            frozenset(str(identity) for identity in group) for group in self._eq.groups()
        )

    def digest(self) -> str:
        """The Merkle-style digest of the maintained derived tier (I-2)."""
        self._sync_curator()
        return derived_digest(self._eq)

    def verify(self) -> ConsistencyReport:
        """Check the maintained index against a from-scratch fold of the live source; self-heal."""
        self._sync_curator()
        checker = ConsistencyChecker(
            self._eq,
            pages=lambda: list(self._union.iter_pages()),
            curator_edges=self._curator_pairs,
        )
        return checker.check_and_repair()

    def _curator_pairs(self) -> list[tuple[Identity, Identity]]:
        pairs: list[tuple[Identity, Identity]] = []
        for group in self._log.fold(self._space).equivalence_groups:
            members = [_identity(m) for m in group]
            pairs.extend((members[0], other) for other in members[1:])
        return pairs


@@ -0,0 +1,112 @@
"""I-2 verification — digest + background consistency-checker (SHARD-WP-0011 T3).
``derived = f(canonical)`` is made *verified*, not asserted. A **Merkle-style digest** summarizes
the derived tier (each identity's content fingerprint + its incident equivalence edges as a leaf,
order-independently combined into a root) so two derived states are equal iff their digests match.
A **consistency-checker** recomputes the authoritative fold from the current source, compares it to
the maintained index over a (sampled) region, and on mismatch performs a **scoped recompute** of
just the affected identities — self-healing drift from a missed delta or corrupted state.
The digest is a pure function of index state, so it is "maintained alongside deltas" for free and
is stable under equivalent event orders (leaves are sorted before combination).
"""
from __future__ import annotations

import hashlib
from collections.abc import Callable, Iterable
from dataclasses import dataclass

from shard_wiki.incremental.equivalence import EquivalenceIndex
from shard_wiki.model import Identity, Page

__all__ = ["region_digest", "derived_digest", "ConsistencyReport", "ConsistencyChecker"]

CuratorEdges = Iterable[tuple[Identity, Identity]]


def _leaf(index: EquivalenceIndex, identity: Identity) -> str:
    """A digest leaf for one identity: its fingerprint + its incident edges (as sorted peers)."""
    fingerprint = index.fingerprint(identity) or ""
    peers = sorted(
        str(other)
        for edge in index.edges()
        if identity in edge
        for other in edge
        if other != identity
    )
    payload = f"{identity}|{fingerprint}|{','.join(peers)}"
    return hashlib.blake2b(payload.encode("utf-8"), digest_size=16).hexdigest()


def region_digest(index: EquivalenceIndex, identities: Iterable[Identity]) -> str:
    """A Merkle-style root over the given identities' leaves (order-independent)."""
    leaves = sorted(_leaf(index, identity) for identity in identities)
    root = hashlib.blake2b(digest_size=16)
    for leaf in leaves:
        root.update(leaf.encode("utf-8"))
    return root.hexdigest()


def derived_digest(index: EquivalenceIndex) -> str:
    """The digest of the whole maintained derived tier."""
    return region_digest(index, index.identities())


@dataclass(frozen=True, slots=True)
class ConsistencyReport:
    """Outcome of a consistency check: what was examined, whether it drifted, and if it healed."""

    checked: int
    drifted: bool
    repaired: bool
    healthy: bool


class ConsistencyChecker:
    """Compares the maintained index against an authoritative rebuild and repairs drift in place."""

    def __init__(
        self,
        index: EquivalenceIndex,
        pages: Callable[[], Iterable[Page]],
        curator_edges: Callable[[], CuratorEdges] = lambda: (),
    ) -> None:
        self._index = index
        self._pages = pages
        self._curator = curator_edges

    def _authoritative(self) -> EquivalenceIndex:
        expected = EquivalenceIndex(
            num_bands=self._index.num_bands, threshold=self._index.threshold
        )
        expected.build(list(self._pages()), list(self._curator()))
        return expected

    def check_and_repair(self, sample: Iterable[Identity] | None = None) -> ConsistencyReport:
        """Verify the (sampled) region against a from-scratch fold; scoped-recompute on mismatch."""
        source = {p.identity: p for p in self._pages()}
        expected = self._authoritative()
        region = (
            set(sample)
            if sample is not None
            else set(source) | set(self._index.identities())
        )
        drifted = region_digest(self._index, region) != region_digest(expected, region)
        if not drifted:
            return ConsistencyReport(len(region), drifted=False, repaired=False, healthy=True)
        self._repair(region, source)
        healthy = region_digest(self._index, region) == region_digest(expected, region)
        return ConsistencyReport(len(region), drifted=True, repaired=True, healthy=healthy)

    def _repair(self, region: set[Identity], source: dict[Identity, Page]) -> None:
        """Scoped recompute: reconcile each affected identity to the current source."""
        present = self._index.identities()
        for identity in region:
            page = source.get(identity)
            if page is not None:
                self._index.update(page) if identity in present else self._index.add(page)
            elif identity in present:
                self._index.remove(identity)
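
The order-independence of the digest comes from sorting leaves before they are combined. The standalone sketch below demonstrates that property with hypothetical helpers (`leaf`, `root`) and made-up identities and fingerprints; it is a simplified illustration of the idea, not the module's code.

```python
import hashlib


def leaf(identity, fingerprint, peers):
    """One digest leaf: identity, content fingerprint, and sorted equivalence peers."""
    payload = f"{identity}|{fingerprint}|{','.join(sorted(peers))}"
    return hashlib.blake2b(payload.encode("utf-8"), digest_size=16).hexdigest()


def root(leaves):
    """Combine leaves into one digest; sorting makes the result order-independent."""
    combined = hashlib.blake2b(digest_size=16)
    for item in sorted(leaves):
        combined.update(item.encode("utf-8"))
    return combined.hexdigest()


# Hypothetical derived state: two equivalent pages and one singleton.
state = [
    leaf("docs:Home", "f1", ["vault:Home"]),
    leaf("vault:Home", "f1", ["docs:Home"]),
    leaf("docs:About", "f2", []),
]

same_state_other_order = list(reversed(state))
changed_state = state[:2] + [leaf("docs:About", "f3", [])]  # one body changed
```

Matching digests mean the two states agree up to hash collisions, which is why a mismatch is treated as drift and triggers a scoped recompute.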


@@ -20,9 +20,20 @@ from shard_wiki.coordination import (
    Overlay,
    OverlayEngine,
)
from shard_wiki.incremental import ConsistencyReport, UnionIndex
from shard_wiki.model import Page
from shard_wiki.policy import DEFAULT_POLICY, Policy
from shard_wiki.union import Resolution, UnionGraph
from shard_wiki.views import (
    AllPagesEntry,
    BackLink,
    ChangeEntry,
    SiteMapNode,
    all_pages,
    build_backlinks,
    recent_changes,
    site_map,
)

__all__ = ["InformationSpace"]
@@ -41,6 +52,8 @@ class InformationSpace:
        self.log = DecisionLog(store)
        self.union = UnionGraph(space_id, log=self.log, policy=policy)
        self.overlays = OverlayEngine(space_id, self.log)
        self._index: UnionIndex | None = None  # maintained derived tier, built lazily
        self._index_stale = True

    @classmethod
    def git_backed(
@@ -57,6 +70,7 @@ class InformationSpace:
"""Attach a shard — only if it passes conformance (verified profile, I-3/§6.6)."""
assert_conformant(adapter)
self.union.attach(adapter)
self._index_stale = True
def alias(self, name: str, target: str, actor: str | None = None) -> None:
"""Record a coordination-canonical alias (``name`` → ``"shard:key"``) in the log."""
@@ -91,4 +105,44 @@ class InformationSpace:
write-through-capable target fast-forwards (write-through); a read-only target keeps the
draft as local truth (I-5: overlay before mutation, always)."""
overlay = self.overlay(name, body, actor=actor)
return self.apply_overlay(overlay.overlay_id)
result = self.apply_overlay(overlay.overlay_id)
self._index_stale = True # the applied edit changes the derived tier
return result
# --- maintained derived tier (SHARD-WP-0011): incremental-first, rebuild as fallback ---
@property
def index(self) -> UnionIndex:
"""The maintained equivalence index (built lazily; rebuilt when the union has changed)."""
if self._index is None:
self._index = UnionIndex(self.union, self.log, self.space_id)
elif self._index_stale:
self._index.rebuild() # bounded fallback after a mutation
self._index_stale = False
return self._index
def reindex(self) -> None:
"""Force a full rebuild of the maintained derived tier (the explicit fallback path)."""
self.index.rebuild()
def verify_index(self) -> ConsistencyReport:
"""Run the I-2 consistency-checker over the maintained tier; self-heal any drift."""
return self.index.verify()
# --- derived views (SHARD-WP-0010): recomputable, provenance-carrying, presentation-free ---
def backlinks(self, name: str, *, camelcase: bool = False) -> tuple[BackLink, ...]:
"""Pages across the union that link to ``name`` (UC-18)."""
return build_backlinks(self.union, camelcase=camelcase).to(name)
def recent_changes(self, *, limit: int | None = None) -> tuple[ChangeEntry, ...]:
"""The merged newest-first change feed: coordination journal + shard signals (UC-17)."""
return recent_changes(self.union, self.log, self.space_id, limit=limit)
def all_pages(self) -> tuple[AllPagesEntry, ...]:
"""The union's distinct pages, collapsed via the maintained equivalence index."""
return all_pages(self.union, equivalence_groups=self.index.equivalence_groups())
def site_map(self) -> SiteMapNode:
"""The union namespace tree built from page placements."""
return site_map(self.union)
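
Review note: the lazy, stale-flag pattern behind the `index` property can be shown on its own. The sketch below uses a hypothetical `LazyIndex` class and a `builds` counter that exist only for illustration; the real `UnionIndex` API is not reproduced here.

```python
class LazyIndex:
    """Derived value that is built on first access and rebuilt only after invalidation."""

    def __init__(self, load):
        self._load = load      # callable returning the current source items
        self._value = None
        self._stale = True
        self.builds = 0        # illustration only: counts how often we rebuild

    def invalidate(self):
        """Mark the derived value as out of date (call after every mutation)."""
        self._stale = True

    @property
    def value(self):
        if self._value is None or self._stale:
            self._value = sorted(self._load())
            self.builds += 1
        self._stale = False
        return self._value


pages = ["Home"]
idx = LazyIndex(lambda: pages)
first = idx.value           # builds once
second = idx.value          # served from the cached value
pages.append("About")
idx.invalidate()            # the mutation flips the flag
third = idx.value           # rebuilt on the next read
```

One thing that may be worth checking in the diff itself: `reindex()` reaches `rebuild()` through the `index` property, so when the stale flag is already set the property appears to rebuild once before the explicit rebuild runs again.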

@@ -13,6 +13,7 @@ imported by nothing.
from __future__ import annotations
import dataclasses
from collections.abc import Iterator
from dataclasses import dataclass
from enum import Enum
@@ -68,6 +69,20 @@ class UnionGraph:
def shard(self, shard_id: str) -> ShardAdapter | None:
return next((s for s in self._shards if s.shard_id == shard_id), None)
@property
def shards(self) -> tuple[ShardAdapter, ...]:
return tuple(self._shards)
def iter_pages(self) -> Iterator[Page]:
"""Every page across attached shards, raw (per-shard, not chorus-collapsed). The
enumeration substrate for derived views — BackLinks, AllPages, SiteMap (§8.4)."""
for shard in self._shards:
for key in shard.keys():
try:
yield shard.read(key)
except KeyError:
continue
def _read_all(self, key: str) -> list[Page]:
pages: list[Page] = []
for shard in self._shards:
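
Review note: `iter_pages` yields raw per-shard pages and silently skips keys that disappear between listing and reading. A minimal sketch of that behaviour, assuming a hypothetical in-memory `DictShard` whose `keys()` can list a key that `read()` no longer finds:

```python
from collections.abc import Iterator


class DictShard:
    """Hypothetical in-memory shard used only for this illustration."""

    def __init__(self, shard_id, pages, listed=None):
        self.shard_id = shard_id
        self._pages = dict(pages)
        self._listed = list(listed if listed is not None else pages)

    def keys(self):
        return list(self._listed)

    def read(self, key):
        # Raises KeyError when the page vanished after it was listed.
        return (self.shard_id, key, self._pages[key])


def iter_pages(shards) -> Iterator[tuple[str, str, str]]:
    """Every page across the shards, per shard, without collapsing same-key pages."""
    for shard in shards:
        for key in shard.keys():
            try:
                yield shard.read(key)
            except KeyError:
                continue  # listed but unreadable: skip rather than abort the scan


shards = [
    DictShard("git", {"Home": "g"}),
    DictShard("notes", {"Home": "n"}, listed=["Home", "Ghost"]),
]
found = list(iter_pages(shards))
```

Both `Home` pages are returned separately, which is the "raw, not chorus-collapsed" behaviour that the derived views build on.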

@@ -0,0 +1,33 @@
"""views/ — derived, recomputable, provenance-carrying read views over the union (§8.4).
All views here are *derived tier*: pure functions of the attached shards plus the coordination-log
fold, storing nothing canonical (SHARD-WP-0011 makes them incrementally maintainable). Presentation
stays out of core (L6) — these produce models, never rendered output. Per the dependency rule this
package imports down (union/model/coordination/provenance) and is imported only by the orchestrator.
"""
from shard_wiki.views.allpages import AllPagesEntry, SiteMapNode, all_pages, site_map
from shard_wiki.views.backlinks import BackLink, BackLinksIndex, build_backlinks
from shard_wiki.views.links import (
ResolvedLink,
WikiLink,
extract_links,
resolve_links,
)
from shard_wiki.views.recentchanges import ChangeEntry, recent_changes
__all__ = [
"WikiLink",
"ResolvedLink",
"extract_links",
"resolve_links",
"BackLink",
"BackLinksIndex",
"build_backlinks",
"ChangeEntry",
"recent_changes",
"AllPagesEntry",
"SiteMapNode",
"all_pages",
"site_map",
]

@@ -0,0 +1,131 @@
"""AllPages + SiteMap — enumeration views over the union (SHARD-WP-0010 T4).
**AllPages** lists the union's distinct pages, collapsing identities that name the same page: a
*chorus* (same key across shards) and *equivalence-bound* identities (decision-log bindings) fold
into one entry, with divergence noted when the members' bodies differ (union without erasure — the
collapse is acknowledged, never silent). **SiteMap** is the namespace tree built from page
placements (paths), spanning shards.
Both are derived/recomputable and presentation-free (the tree is a model, not rendered HTML).
"""
from __future__ import annotations
from dataclasses import dataclass
from shard_wiki.model import Identity, Page
from shard_wiki.union import UnionGraph
__all__ = ["AllPagesEntry", "SiteMapNode", "all_pages", "site_map"]
@dataclass(frozen=True, slots=True)
class AllPagesEntry:
"""One union page: its representative ``name``, the ``members`` collapsed into it, and whether
those members' bodies differ (``diverges``: a chorus with differing content)."""
name: str
members: tuple[Identity, ...]
diverges: bool
@dataclass(frozen=True, slots=True)
class SiteMapNode:
"""A namespace node: its path ``name``, child namespaces, and pages directly under it."""
name: str
children: tuple[SiteMapNode, ...]
pages: tuple[Identity, ...]
class _UnionFind:
def __init__(self) -> None:
self._parent: dict[str, str] = {}
def add(self, x: str) -> None:
self._parent.setdefault(x, x)
def find(self, x: str) -> str:
self.add(x)
root = x
while self._parent[root] != root:
root = self._parent[root]
while self._parent[x] != root:
self._parent[x], x = root, self._parent[x]
return root
def union(self, a: str, b: str) -> None:
self.add(a)
self.add(b)
ra, rb = self.find(a), self.find(b)
if ra != rb:
self._parent[max(ra, rb)] = min(ra, rb)
def all_pages(
union: UnionGraph,
equivalence_groups: tuple[frozenset[str], ...] | None = None,
) -> tuple[AllPagesEntry, ...]:
"""Enumerate the union's distinct pages, collapsing chorus + equivalence-bound members.
``equivalence_groups`` (string identities, decision-log form) overrides the source of
equivalence — the orchestrator passes the maintained index's groups (SHARD-WP-0011 T4); the
default falls back to the decision-log fold, so direct callers are unaffected.
"""
pages: dict[str, Page] = {}
by_key: dict[str, list[str]] = {}
for page in union.iter_pages():
ident = str(page.identity)
pages[ident] = page
by_key.setdefault(page.identity.key, []).append(ident)
uf = _UnionFind()
for ident in pages:
uf.add(ident)
for idents in by_key.values(): # same key across shards → chorus
for other in idents[1:]:
uf.union(idents[0], other)
if equivalence_groups is None:
equivalence_groups = union.log.fold(union.space).equivalence_groups
for group in equivalence_groups: # curator bindings (+ maintained content edges)
present = [m for m in group if m in pages]
for other in present[1:]:
uf.union(present[0], other)
groups: dict[str, list[str]] = {}
for ident in pages:
groups.setdefault(uf.find(ident), []).append(ident)
entries: list[AllPagesEntry] = []
for members in groups.values():
member_pages = [pages[m] for m in members]
identities = tuple(p.identity for p in member_pages)
name = min(p.identity.key for p in member_pages)
diverges = len({p.body for p in member_pages}) > 1
entries.append(AllPagesEntry(name=name, members=identities, diverges=diverges))
return tuple(sorted(entries, key=lambda e: e.name))
def _segments(page: Page) -> list[str]:
path = page.placements[0].path if page.placements else page.identity.key
if path.endswith(".md"):
path = path[:-3]
return [seg for seg in path.split("/") if seg]
def site_map(union: UnionGraph) -> SiteMapNode:
"""The union namespace tree from page placements (directories nest; pages sit at their dir)."""
root: dict = {"children": {}, "pages": []}
for page in union.iter_pages():
segments = _segments(page)
node = root
for seg in segments[:-1]: # directory segments build the nesting
node = node["children"].setdefault(seg, {"children": {}, "pages": []})
node["pages"].append(page.identity)
return _freeze("", root)
def _freeze(name: str, node: dict) -> SiteMapNode:
children = tuple(_freeze(k, v) for k, v in sorted(node["children"].items()))
pages = tuple(sorted(node["pages"], key=str))
return SiteMapNode(name=name, children=children, pages=pages)
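
Review note: the collapse performed by `all_pages` (chorus by key, then explicit equivalence groups, then a divergence flag) can be reproduced on plain tuples. This is a sketch under simplifying assumptions: identities are `(shard, key)` tuples, pages are a dict of bodies, and `collapse` is a hypothetical name.

```python
from collections import defaultdict


def collapse(pages, equivalence_groups=()):
    """pages: {(shard, key): body}. Returns sorted (name, members, diverges) tuples."""
    parent = {ident: ident for ident in pages}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # deterministic root: smaller identity wins

    by_key = defaultdict(list)
    for ident in pages:
        by_key[ident[1]].append(ident)
    for idents in by_key.values():  # same key across shards forms a chorus
        for other in idents[1:]:
            union(idents[0], other)
    for group in equivalence_groups:  # explicit bindings; unknown members are ignored
        present = [m for m in group if m in pages]
        for other in present[1:]:
            union(present[0], other)

    groups = defaultdict(list)
    for ident in pages:
        groups[find(ident)].append(ident)
    entries = []
    for members in groups.values():
        name = min(key for _, key in members)
        diverges = len({pages[m] for m in members}) > 1
        entries.append((name, tuple(sorted(members)), diverges))
    return sorted(entries)


pages = {
    ("git", "Shared"): "from git",
    ("notes", "Shared"): "from folder",
    ("notes", "Alias"): "from git",
    ("git", "Solo"): "alone",
}
entries = collapse(pages, equivalence_groups=[{("git", "Shared"), ("notes", "Alias")}])
```

Note that the representative name is the smallest key in the group, so binding `Alias` to `Shared` lists the merged entry under `Alias`. The divergence flag stays set because the two `Shared` bodies differ.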

@@ -0,0 +1,65 @@
"""BackLinks — the strongest core derived view (SHARD-WP-0010 T2; UC-18).
For any page name, the set of pages that link to it. Built by extracting wikilinks (T1) from every
page across the attached shards and resolving each through the union: only **resolved** links
create a backlink (a red-link points at nothing, so it contributes none). Entries carry their
**source provenance** (the linking page's identity / shard). Keying by the resolved *name* means a
chorus target aggregates the backlinks of all its members into one bucket (union without erasure).
Derived/recomputable — stores nothing canonical; SHARD-WP-0011 maintains it incrementally.
"""
from __future__ import annotations
from collections.abc import Mapping
from dataclasses import dataclass
from shard_wiki.model import Identity
from shard_wiki.union import UnionGraph
from shard_wiki.views.links import resolve_links
__all__ = ["BackLink", "BackLinksIndex", "build_backlinks"]
@dataclass(frozen=True, slots=True)
class BackLink:
"""One inbound link: ``source`` (the linking page) references ``target_name``."""
source: Identity
target_name: str
@property
def source_shard(self) -> str:
return self.source.shard
class BackLinksIndex:
"""An immutable name → inbound-links index over the union link graph."""
def __init__(self, edges: Mapping[str, tuple[BackLink, ...]]) -> None:
self._edges = dict(edges)
def to(self, name: str) -> tuple[BackLink, ...]:
"""The backlinks pointing at ``name`` (empty if none)."""
return self._edges.get(name, ())
def sources(self, name: str) -> frozenset[Identity]:
"""Just the identities linking to ``name`` — convenient for set assertions."""
return frozenset(bl.source for bl in self.to(name))
def names(self) -> frozenset[str]:
return frozenset(self._edges)
def build_backlinks(union: UnionGraph, *, camelcase: bool = False) -> BackLinksIndex:
"""Scan every union page's links and index the resolved ones by target name."""
edges: dict[str, set[BackLink]] = {}
for page in union.iter_pages():
for resolved in resolve_links(union, page.body, camelcase=camelcase):
if resolved.is_red_link:
continue # red-links don't create backlinks
backlink = BackLink(source=page.identity, target_name=resolved.link.target)
edges.setdefault(resolved.link.target, set()).add(backlink)
return BackLinksIndex(
{name: tuple(sorted(links, key=lambda bl: str(bl.source))) for name, links in edges.items()}
)
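
Review note: the backlink index is an inverted link graph in which unresolved targets are dropped. A minimal sketch, assuming a single namespace where "resolves" simply means "a page with that name exists"; the regex is a simplified stand-in for the project's link extractor.

```python
import re

# Simplified wikilink pattern: [[Target]] or [[Target|label]].
WIKILINK = re.compile(r"\[\[\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]")


def build_backlinks(pages):
    """pages: {name: body}. Returns {target: sorted tuple of source names}."""
    edges = {}
    for source, body in pages.items():
        for match in WIKILINK.finditer(body):
            target = match.group(1)
            if target not in pages:
                continue  # red-link: points at nothing, so it contributes no backlink
            edges.setdefault(target, set()).add(source)  # a set dedups repeated links
    return {target: tuple(sorted(sources)) for target, sources in edges.items()}


pages = {
    "Home": "See [[Guide]] and [[Missing]].",
    "Guide": "Back to [[Home|the front page]]; again [[Home]].",
    "Notes": "Also [[Guide]].",
}
backlinks = build_backlinks(pages)
```

`Missing` never appears as a key, and `Guide` links to `Home` twice but is recorded once, matching the set-based accumulation in `build_backlinks`.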

@@ -0,0 +1,91 @@
"""Wikilink + red-link model (SHARD-WP-0010 T1; FederationRequirements ADR-06).
A CommonMark *wikilink extension*: ``[[Target]]`` and ``[[Target|label]]`` are extracted from a
page body and each target is resolved through the union (ADR-01). A target that resolves is a
**link**; one that does not is a **red-link** — a createable hole (UC-23), never a dropped
reference (union without erasure). CamelCase auto-linking (``WikiWord``) is **off by default** and
opt-in per space, since bare CamelCase is noisy and policy-laden.
The link *model and resolution* are core; turning a :class:`ResolvedLink` into an ``<a>`` (or a
red anchor) is L6 presentation and lives outside this package. Link spans are character offsets
(indices into the body ``str``, not byte offsets) so a later layer can address them precisely.
"""
from __future__ import annotations
import re
from dataclasses import dataclass
from shard_wiki.union import Resolution, UnionGraph
__all__ = ["WikiLink", "ResolvedLink", "extract_links", "resolve_links"]
_WIKILINK_RE = re.compile(r"\[\[\s*([^\]|]+?)\s*(?:\|\s*([^\]]+?)\s*)?\]\]")
# A WikiWord: ≥2 capitalized alphanumeric segments run together (e.g. FrontPage, WikiWord).
_CAMELCASE_RE = re.compile(r"\b([A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+)\b")
_FENCED_RE = re.compile(r"```.*?```", re.DOTALL)
_INLINE_CODE_RE = re.compile(r"`[^`\n]*`")
@dataclass(frozen=True, slots=True)
class WikiLink:
"""One extracted reference. ``target`` is the resolve key; ``label`` is the display text (or
None to use the target); ``span`` is the ``[start, end)`` offset of the whole token in the body;
``auto`` marks a CamelCase auto-link (vs an explicit ``[[...]]``)."""
target: str
label: str | None
span: tuple[int, int]
auto: bool = False
@property
def text(self) -> str:
return self.label or self.target
@dataclass(frozen=True, slots=True)
class ResolvedLink:
"""A :class:`WikiLink` paired with its union :class:`Resolution` (the link's truth status)."""
link: WikiLink
resolution: Resolution
@property
def is_red_link(self) -> bool:
return self.resolution.is_red_link
def _mask(body: str, pattern: re.Pattern[str]) -> str:
"""Blank out ``pattern`` matches with equal-length spaces so later scans skip them while every
surviving match keeps its true offset."""
return pattern.sub(lambda m: " " * len(m.group(0)), body)
def extract_links(body: str, *, camelcase: bool = False) -> tuple[WikiLink, ...]:
"""Extract wikilinks from ``body`` in document order, skipping fenced/inline code.
With ``camelcase=True`` (per-space opt-in), bare ``WikiWord`` tokens outside code and outside
existing ``[[...]]`` also become links.
"""
scan = _mask(_mask(body, _FENCED_RE), _INLINE_CODE_RE)
links: list[WikiLink] = []
for m in _WIKILINK_RE.finditer(scan):
links.append(WikiLink(target=m.group(1).strip(), label=m.group(2), span=m.span()))
if camelcase:
# Mask explicit-link spans too, so a CamelCase target inside [[...]] isn't double-counted.
cc_scan = _mask(scan, _WIKILINK_RE)
for m in _CAMELCASE_RE.finditer(cc_scan):
links.append(WikiLink(target=m.group(1), label=None, span=m.span(), auto=True))
return tuple(sorted(links, key=lambda link: link.span[0]))
def resolve_links(
union: UnionGraph, body: str, *, camelcase: bool = False
) -> tuple[ResolvedLink, ...]:
"""Extract and resolve every link in ``body`` against ``union`` (link vs red-link, ADR-01)."""
return tuple(
ResolvedLink(link, union.resolve(link.target))
for link in extract_links(body, camelcase=camelcase)
)
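
Review note: the equal-length masking trick is what keeps link spans valid after code regions are blanked out. The sketch below reuses the patterns from this file; the fence pattern is assembled from a `FENCE` constant only so the example can sit inside a fenced block, and `extract` is a simplified stand-in that returns tuples.

```python
import re

FENCE = "`" * 3
WIKILINK_RE = re.compile(r"\[\[\s*([^\]|]+?)\s*(?:\|\s*([^\]]+?)\s*)?\]\]")
FENCED_RE = re.compile(FENCE + r".*?" + FENCE, re.DOTALL)
INLINE_CODE_RE = re.compile(r"`[^`\n]*`")


def mask(body, pattern):
    """Replace each match with spaces of the same length so later offsets stay true."""
    return pattern.sub(lambda m: " " * len(m.group(0)), body)


def extract(body):
    """Return (target, label, span) for every wikilink outside fenced or inline code."""
    scan = mask(mask(body, FENCED_RE), INLINE_CODE_RE)
    return [(m.group(1), m.group(2), m.span()) for m in WIKILINK_RE.finditer(scan)]


body = "See [[Home]] but not `[[Code]]`, then [[ Guide | the guide ]]."
links = extract(body)
tokens = [body[start:end] for _, _, (start, end) in links]
```

Because the masked text has the same length as the original, slicing the original body with a reported span returns the exact link token, including any padding inside the brackets.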

@@ -0,0 +1,108 @@
"""RecentChanges — a merged change feed over the union (SHARD-WP-0010 T3; UC-17).
Two streams, one ordered feed (newest-first):
* the **coordination journal** — overlay/alias/fork/merge/binding decisions from the decision log,
each carrying its actor and the decision payload; and
* **shard change signals** — a page's current revision (folder mtime / ``source_rev``), i.e. the
backend's own "this changed" evidence.
Every entry carries provenance: which shard the edit came from, or that it was a coordination
decision (and by whom). Derived/recomputable — `notify`-driven streaming is a later binding.
"""
from __future__ import annotations
from collections.abc import Mapping
from dataclasses import dataclass, field
from datetime import datetime
from shard_wiki.coordination import DecisionLog, EventType
from shard_wiki.union import UnionGraph
__all__ = ["ChangeEntry", "recent_changes"]
_COORDINATION = "coordination"
# How each journal event names the thing it touched + a human kind label.
_EVENT_KIND = {
EventType.ALIAS_SET: "alias",
EventType.OVERLAY_CREATED: "overlay",
EventType.MERGE_DECIDED: "merge",
EventType.PAGE_FORKED: "fork",
EventType.BINDING_MADE: "binding",
}
@dataclass(frozen=True, slots=True)
class ChangeEntry:
"""One change in the feed. ``source`` is the shard id (a shard edit) or ``"coordination"``."""
when: datetime
kind: str
ref: str
source: str
actor: str | None = None
detail: Mapping[str, object] = field(default_factory=dict)
def _event_ref(event_type: EventType, payload: Mapping[str, object]) -> str:
if event_type is EventType.ALIAS_SET:
return str(payload.get("alias", ""))
if event_type is EventType.OVERLAY_CREATED:
return f"{payload.get('target_shard')}:{payload.get('target_key')}"
if event_type is EventType.PAGE_FORKED:
return f"{payload.get('source')} → {payload.get('fork')}"
if event_type is EventType.BINDING_MADE:
return ", ".join(str(m) for m in payload.get("members", ()))
return str(payload.get("overlay_id", "")) # MERGE_DECIDED
def recent_changes(
union: UnionGraph,
log: DecisionLog,
space: str,
*,
limit: int | None = None,
) -> tuple[ChangeEntry, ...]:
"""Merge the coordination journal and shard change signals into one newest-first feed."""
entries: list[ChangeEntry] = []
for event in log.events(space):
entries.append(
ChangeEntry(
when=event.timestamp,
kind=_EVENT_KIND.get(event.type, event.type.value),
ref=_event_ref(event.type, event.payload),
source=_COORDINATION,
actor=event.actor,
detail=dict(event.payload),
)
)
for page in union.iter_pages():
rev = page.envelope.source_rev
when = _parse_rev(rev)
if when is None:
continue # shard offers no change signal for this page — skip gracefully
entries.append(
ChangeEntry(
when=when,
kind="edit",
ref=str(page.identity),
source=page.identity.shard,
detail={"source_rev": rev},
)
)
entries.sort(key=lambda e: e.when, reverse=True)
return tuple(entries if limit is None else entries[:limit])
def _parse_rev(rev: str | None) -> datetime | None:
if rev is None:
return None
try:
return datetime.fromisoformat(rev)
except ValueError:
return None # non-temporal revision token (e.g. a content hash) — no feed timestamp
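
Review note: the feed merges two streams and drops shard pages whose revision token is not a timestamp. A minimal sketch with tuples instead of `ChangeEntry`; the sample journal entry, revision strings, and the `merged_feed` name are made up for illustration.

```python
from datetime import datetime, timezone


def parse_rev(rev):
    """Return a datetime for ISO-8601 revision tokens, or None for anything else."""
    if rev is None:
        return None
    try:
        return datetime.fromisoformat(rev)
    except ValueError:
        return None  # for example a commit sha: no timestamp to place in the feed


def merged_feed(journal, shard_revs, limit=None):
    """journal: [(when, kind, ref)]; shard_revs: {"shard:key": rev}. Newest first."""
    entries = [(when, kind, ref, "coordination") for when, kind, ref in journal]
    for ref, rev in shard_revs.items():
        when = parse_rev(rev)
        if when is None:
            continue
        entries.append((when, "edit", ref, ref.split(":", 1)[0]))
    entries.sort(key=lambda e: e[0], reverse=True)
    return entries if limit is None else entries[:limit]


journal = [(datetime(2026, 6, 16, 2, 0, tzinfo=timezone.utc), "alias", "Start")]
shard_revs = {
    "notes:Daily": "2026-06-16T03:00:00+00:00",
    "git:Home": "a4e0f52ec1",  # sha-like token: skipped
    "notes:Old": "2026-06-15T00:00:00+00:00",
}
feed = merged_feed(journal, shard_revs)
```

The sketch keeps every timestamp timezone-aware. Python refuses to compare naive and aware datetimes (it raises `TypeError`), so whether the real feed needs normalization depends on what the adapters put in `source_rev`, which this diff does not show.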

tests/test_git_adapter.py Normal file

@@ -0,0 +1,131 @@
"""Tests for the GitShardAdapter read path + profile (SHARD-WP-0012 T1)."""
import subprocess
import pytest
from shard_wiki.adapters import GitShardAdapter, run_conformance
from shard_wiki.model import (
AttachmentMode,
History,
NotSupported,
ProfileError,
Substrate,
Verb,
)
def _git(repo, *args):
subprocess.run(
["git", "-C", str(repo), *args],
check=True,
capture_output=True,
env={"GIT_AUTHOR_NAME": "t", "GIT_AUTHOR_EMAIL": "t@t",
"GIT_COMMITTER_NAME": "t", "GIT_COMMITTER_EMAIL": "t@t",
"PATH": __import__("os").environ.get("PATH", "")},
)
def _repo(tmp_path, files, name="repo"):
repo = tmp_path / name
repo.mkdir()
_git(repo, "init", "--quiet")
for rel, text in files.items():
p = repo / rel
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(text, encoding="utf-8")
_git(repo, "add", rel)
_git(repo, "commit", "-m", "seed")
return repo
def test_keys_are_tracked_md_paths(tmp_path):
repo = _repo(tmp_path, {"Home.md": "h", "docs/Guide.md": "g", "ignore.txt": "x"})
adapter = GitShardAdapter("git", repo)
assert set(adapter.keys()) == {"Home", "docs/Guide"} # only tracked *.md
def test_read_returns_page_with_commit_sha_rev(tmp_path):
repo = _repo(tmp_path, {"Home.md": "welcome"})
adapter = GitShardAdapter("git", repo)
page = adapter.read("Home")
assert page.identity.shard == "git"
assert page.body == "welcome"
head = subprocess.run(
["git", "-C", str(repo), "rev-parse", "HEAD"], capture_output=True, text=True, check=True
).stdout.strip()
assert page.envelope.source_rev == head # source_rev is the commit sha
assert page.envelope.lineage == "git-native"
def test_read_missing_key_raises(tmp_path):
adapter = GitShardAdapter("git", _repo(tmp_path, {"Home.md": "h"}))
with pytest.raises(KeyError):
adapter.read("Nope")
def test_profile_validates_implication_rules(tmp_path):
profile = GitShardAdapter("git", _repo(tmp_path, {"Home.md": "h"})).profile()
assert profile.substrate is Substrate.GIT
assert profile.attachment_mode is AttachmentMode.GIT_IS_STORE
assert profile.history is History.GIT_NATIVE # git-is-store ⟹ git-native
profile.validate() # raises if the implication rule were violated
def test_profile_is_read_only_in_t1(tmp_path):
profile = GitShardAdapter("git", _repo(tmp_path, {"Home.md": "h"})).profile()
assert profile.supports(Verb.READ)
assert not profile.supports(Verb.WRITE)
def test_conformance_read_path_passes(tmp_path):
adapter = GitShardAdapter("git", _repo(tmp_path, {"Home.md": "h", "Other.md": "o"}))
report = run_conformance(adapter)
assert report.ok, report.diff()
def test_unclaimed_write_raises_not_supported(tmp_path):
adapter = GitShardAdapter("git", _repo(tmp_path, {"Home.md": "h"}))
with pytest.raises(NotSupported):
adapter.write("Home", "new") # read-only: honest absence
def test_empty_repo_has_no_keys(tmp_path):
repo = tmp_path / "empty"
repo.mkdir()
_git(repo, "init", "--quiet")
adapter = GitShardAdapter("git", repo)
assert list(adapter.keys()) == []
def test_bad_profile_combo_is_rejected():
# Sanity: the implication rule that backs the git profile actually bites when violated.
from shard_wiki.model import (
AccessGrant,
Addressing,
CapabilityProfile,
ContentOpacity,
MergeModel,
NativeQuery,
OperationalEnvelope,
Translation,
WriteGranularity,
)
from shard_wiki.provenance import Liveness
with pytest.raises(ProfileError):
CapabilityProfile(
substrate=Substrate.FILES, # not git, but claims git-is-store
attachment_mode=AttachmentMode.GIT_IS_STORE,
write_granularity=WriteGranularity.NONE,
content_opacity=ContentOpacity.TRANSPARENT,
operational_envelope=OperationalEnvelope.LOCAL_UNBOUNDED,
access_grant=AccessGrant.OPEN,
liveness=Liveness.STATIC,
history=History.NONE,
merge_model=MergeModel.NONE,
addressing=Addressing.PATH,
native_query=NativeQuery.NONE,
translation=Translation.NATIVE,
supported_verbs=frozenset({Verb.READ}),
).validate()
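
Review note: the implication rule exercised by the last test (git-is-store requires a git substrate and implies git-native history) can be pictured with a three-field profile. Everything below is a hypothetical miniature, not the real `CapabilityProfile`: the enum members beyond those named in the tests and the order of the checks are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Substrate(Enum):
    GIT = "git"
    FILES = "files"


class AttachmentMode(Enum):
    GIT_IS_STORE = "git-is-store"
    DETACHED = "detached"  # hypothetical second mode, present only for contrast


class History(Enum):
    GIT_NATIVE = "git-native"
    NONE = "none"


class ProfileError(ValueError):
    """Raised when declared capabilities contradict each other."""


@dataclass(frozen=True)
class MiniProfile:
    substrate: Substrate
    attachment_mode: AttachmentMode
    history: History

    def validate(self) -> "MiniProfile":
        if self.attachment_mode is AttachmentMode.GIT_IS_STORE:
            if self.substrate is not Substrate.GIT:
                raise ProfileError("git-is-store requires a git substrate")
            if self.history is not History.GIT_NATIVE:
                raise ProfileError("git-is-store implies git-native history")
        return self


def is_valid(profile: MiniProfile) -> bool:
    try:
        profile.validate()
    except ProfileError:
        return False
    return True


good = MiniProfile(Substrate.GIT, AttachmentMode.GIT_IS_STORE, History.GIT_NATIVE)
bad = MiniProfile(Substrate.FILES, AttachmentMode.GIT_IS_STORE, History.NONE)
```

Validating at declaration time is what allows conformance checks to trust a profile before any page is read.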

@@ -0,0 +1,116 @@
"""GitShardAdapter history adopt + cross-substrate integration (SHARD-WP-0012 T3)."""
import os
import subprocess
import pytest
from shard_wiki.adapters import FolderAdapter, GitShardAdapter
from shard_wiki.coordination import ApplyStatus
from shard_wiki.space import InformationSpace
_ENV = {
"GIT_AUTHOR_NAME": "t", "GIT_AUTHOR_EMAIL": "t@t",
"GIT_COMMITTER_NAME": "t", "GIT_COMMITTER_EMAIL": "t@t",
"PATH": os.environ.get("PATH", ""),
}
def _git(repo, *args):
return subprocess.run(
["git", "-C", str(repo), *args], check=True, capture_output=True, text=True, env=_ENV
).stdout.strip()
def _git_repo(tmp_path, files, name="git"):
repo = tmp_path / name
repo.mkdir()
_git(repo, "init", "--quiet")
for rel, text in files.items():
(repo / rel).parent.mkdir(parents=True, exist_ok=True)
(repo / rel).write_text(text, encoding="utf-8")
_git(repo, "add", rel)
_git(repo, "commit", "-m", "seed")
return repo
def _folder(tmp_path, name, files, writable=False):
root = tmp_path / name
for rel, text in files.items():
p = root / rel
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(text, encoding="utf-8")
return FolderAdapter(name, root, writable=writable)
# -- history adopt -------------------------------------------------------------
def test_history_lists_commits_newest_first(tmp_path):
repo = _git_repo(tmp_path, {"Home.md": "v1"})
adapter = GitShardAdapter("git", repo, writable=True)
adapter.write("Home", "v2")
history = adapter.history("Home")
assert len(history) == 2
assert history[0].message == "write Home.md" # newest first
assert history[-1].message == "seed"
assert all(rev.sha for rev in history)
def test_history_unknown_key_raises(tmp_path):
adapter = GitShardAdapter("git", _git_repo(tmp_path, {"Home.md": "h"}))
with pytest.raises(KeyError):
adapter.history("Nope")
# -- cross-substrate integration ----------------------------------------------
def test_resolve_across_git_and_folder(tmp_path):
space = InformationSpace("space")
space.attach(GitShardAdapter("git", _git_repo(tmp_path, {"Home.md": "git home"})))
space.attach(_folder(tmp_path, "notes", {"Daily.md": "folder daily"}))
assert space.read("Home").body == "git home" # resolved from the git shard
assert space.read("Daily").body == "folder daily" # resolved from the folder shard
def test_chorus_spans_substrates_with_divergence(tmp_path):
space = InformationSpace("space")
space.attach(GitShardAdapter("git", _git_repo(tmp_path, {"Shared.md": "from git"})))
space.attach(_folder(tmp_path, "notes", {"Shared.md": "from folder"}))
res = space.resolve("Shared")
assert {p.body for p in res.pages} == {"from git", "from folder"} # chorus across substrates
git_page = next(p for p in res.pages if p.identity.shard == "git")
assert git_page.envelope.divergence # divergence recorded, not erased
def test_edit_through_git_shard_commits(tmp_path):
repo = _git_repo(tmp_path, {"Home.md": "original"})
space = InformationSpace("space")
space.attach(GitShardAdapter("git", repo, writable=True))
result = space.edit("Home", "edited via overlay")
assert result.status is ApplyStatus.APPLIED # write-through fast-forward on a git shard
assert space.read("Home").body == "edited via overlay"
assert int(_git(repo, "rev-list", "--count", "HEAD")) == 2 # the edit became a commit
def test_apply_under_drift_refuses_on_external_commit(tmp_path):
repo = _git_repo(tmp_path, {"Home.md": "original"})
space = InformationSpace("space")
space.attach(GitShardAdapter("git", repo, writable=True))
overlay = space.overlay("Home", "my draft") # base_rev = current git sha
# Another writer commits to the same path → the sha moves underneath the draft.
(repo / "Home.md").write_text("someone else", encoding="utf-8")
_git(repo, "add", "Home.md")
_git(repo, "commit", "-m", "external")
result = space.apply_overlay(overlay.overlay_id)
assert result.status is ApplyStatus.REFUSED_DRIFT # never clobber (sha drift detected)
# The shard itself is untouched — the external commit stands; the draft remains a draft.
assert space.union.shard("git").read("Home").body == "someone else"
def test_overlay_on_read_only_git_shard_kept_as_draft(tmp_path):
space = InformationSpace("space")
space.attach(GitShardAdapter("git", _git_repo(tmp_path, {"Home.md": "ro"}), writable=False))
result = space.edit("Home", "wanted change")
assert result.status is ApplyStatus.KEPT_DRAFT # read-only target → overlay retained
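
Review note: the three outcomes tested above (applied, refused on drift, kept as draft) follow from comparing the overlay's base revision with the shard's current one. A toy sketch, assuming integer revisions and a hypothetical `ToyShard`; the order of the two checks is an assumption and is not taken from the overlay engine.

```python
from dataclasses import dataclass
from enum import Enum


class ApplyStatus(Enum):
    APPLIED = "applied"
    REFUSED_DRIFT = "refused_drift"
    KEPT_DRAFT = "kept_draft"


class ToyShard:
    """Hypothetical shard with a single page and a monotonically increasing revision."""

    def __init__(self, body, writable=True):
        self.body = body
        self.rev = 1
        self.writable = writable

    def write(self, body):
        self.body = body
        self.rev += 1


@dataclass(frozen=True)
class Overlay:
    body: str
    base_rev: int  # the shard revision the draft was written against


def apply_overlay(shard, overlay):
    if not shard.writable:
        return ApplyStatus.KEPT_DRAFT      # read-only target: the draft stays local
    if shard.rev != overlay.base_rev:
        return ApplyStatus.REFUSED_DRIFT   # someone else moved the page: never clobber
    shard.write(overlay.body)              # fast-forward: base still matches
    return ApplyStatus.APPLIED


clean = ToyShard("original")
applied = apply_overlay(clean, Overlay("edited", base_rev=clean.rev))

drifting = ToyShard("original")
draft = Overlay("my draft", base_rev=drifting.rev)
drifting.write("someone else")             # external change after the draft was taken
refused = apply_overlay(drifting, draft)

read_only = ToyShard("ro", writable=False)
kept = apply_overlay(read_only, Overlay("wanted change", base_rev=read_only.rev))
```

In the refused case the shard keeps the external content, mirroring the assertion that the external commit stands and the draft remains a draft.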

@@ -0,0 +1,89 @@
"""Tests for GitShardAdapter write=commit + current_rev drift (SHARD-WP-0012 T2)."""
import os
import subprocess
from shard_wiki.adapters import GitShardAdapter, run_conformance
from shard_wiki.model import Verb
_ENV = {
"GIT_AUTHOR_NAME": "t", "GIT_AUTHOR_EMAIL": "t@t",
"GIT_COMMITTER_NAME": "t", "GIT_COMMITTER_EMAIL": "t@t",
"PATH": os.environ.get("PATH", ""),
}
def _git(repo, *args):
return subprocess.run(
["git", "-C", str(repo), *args], check=True, capture_output=True, text=True, env=_ENV
).stdout.strip()
def _repo(tmp_path, files):
repo = tmp_path / "repo"
repo.mkdir()
_git(repo, "init", "--quiet")
for rel, text in files.items():
(repo / rel).write_text(text, encoding="utf-8")
_git(repo, "add", rel)
_git(repo, "commit", "-m", "seed")
return repo
def test_writable_profile_declares_write_and_version(tmp_path):
profile = GitShardAdapter("git", _repo(tmp_path, {"Home.md": "h"}), writable=True).profile()
assert profile.supports(Verb.WRITE)
assert profile.supports(Verb.VERSION)
profile.validate() # PER_PAGE + WRITE is a consistent combination
def test_write_creates_a_commit(tmp_path):
repo = _repo(tmp_path, {"Home.md": "old"})
adapter = GitShardAdapter("git", repo, writable=True)
before = _git(repo, "rev-list", "--count", "HEAD")
page = adapter.write("Home", "new body")
after = _git(repo, "rev-list", "--count", "HEAD")
assert int(after) == int(before) + 1 # one new commit
assert page.body == "new body"
assert page.envelope.source_rev == _git(repo, "rev-parse", "HEAD") # page is at the new sha
def test_write_advances_current_rev(tmp_path):
repo = _repo(tmp_path, {"Home.md": "old"})
adapter = GitShardAdapter("git", repo, writable=True)
rev_before = adapter.current_rev("Home")
adapter.write("Home", "changed")
assert adapter.current_rev("Home") != rev_before # sha moved → drift detectable
def test_write_new_key_tracks_it(tmp_path):
repo = _repo(tmp_path, {"Home.md": "h"})
adapter = GitShardAdapter("git", repo, writable=True)
adapter.write("docs/New", "fresh page")
assert "docs/New" in set(adapter.keys())
assert adapter.read("docs/New").body == "fresh page"
def test_noop_write_creates_no_empty_commit(tmp_path):
repo = _repo(tmp_path, {"Home.md": "same"})
adapter = GitShardAdapter("git", repo, writable=True)
before = _git(repo, "rev-list", "--count", "HEAD")
adapter.write("Home", "same") # identical body → nothing to commit
assert _git(repo, "rev-list", "--count", "HEAD") == before
def test_current_rev_reflects_external_commit(tmp_path):
repo = _repo(tmp_path, {"Home.md": "h"})
adapter = GitShardAdapter("git", repo, writable=True)
rev = adapter.current_rev("Home")
# An out-of-band commit to the same path (another writer) moves the per-path sha.
(repo / "Home.md").write_text("externally edited", encoding="utf-8")
_git(repo, "add", "Home.md")
_git(repo, "commit", "-m", "external")
assert adapter.current_rev("Home") != rev
def test_conformance_positive_write_probe_passes(tmp_path):
adapter = GitShardAdapter("git", _repo(tmp_path, {"Home.md": "body"}), writable=True)
report = run_conformance(adapter)
assert report.ok, report.diff()
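
Review note: two properties tested above are easy to confuse: a no-op write creates no commit, and `current_rev` is per path rather than repository-wide. The sketch below models both with an in-memory commit list. `ToyRepo` and its hashing scheme are hypothetical and do not call git.

```python
import hashlib


class ToyRepo:
    """In-memory stand-in for a git store: one commit per changed path."""

    def __init__(self):
        self.files = {}
        self.commits = []  # (sha, path, message), oldest first

    def _commit(self, path, message):
        payload = f"{len(self.commits)}:{path}:{self.files[path]}".encode()
        sha = hashlib.sha1(payload).hexdigest()
        self.commits.append((sha, path, message))
        return sha

    def write(self, path, body):
        if self.files.get(path) == body:
            return self.current_rev(path)  # identical body: skip the empty commit
        self.files[path] = body
        return self._commit(path, f"write {path}")

    def current_rev(self, path):
        """Sha of the newest commit that touched ``path`` (None if never committed)."""
        for sha, touched, _ in reversed(self.commits):
            if touched == path:
                return sha
        return None


repo = ToyRepo()
repo.write("Home.md", "v1")
home_rev = repo.current_rev("Home.md")
repo.write("Other.md", "x")                # unrelated path: Home's revision stays put
rev_after_other = repo.current_rev("Home.md")
count_before_noop = len(repo.commits)
repo.write("Home.md", "v1")                # no-op write
count_after_noop = len(repo.commits)
repo.write("Home.md", "v2")
rev_after_edit = repo.current_rev("Home.md")
```

Scoping the revision to the path is what keeps drift detection from firing when an unrelated page in the same repository changes.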

@@ -0,0 +1,89 @@
"""Tests for the indexed equivalence relation — blocking + verify (SHARD-WP-0011 T1)."""
from itertools import combinations
from shard_wiki.incremental import EquivalenceIndex, MinHasher, band_keys, jaccard, shingles
from shard_wiki.incremental.equivalence import _fingerprint
from shard_wiki.model import Identity, Page
from shard_wiki.provenance import ProvenanceEnvelope
def _page(shard, key, body):
return Page(
identity=Identity(shard, key),
body=body,
envelope=ProvenanceEnvelope(source_shard=shard),
)
def _brute_force_groups(pages, threshold):
"""Oracle: O(N²) verify of every pair, then connected components."""
parent = {p.identity: p.identity for p in pages}
def find(x):
while parent[x] != x:
parent[x] = parent[parent[x]]
x = parent[x]
return x
for p, q in combinations(pages, 2):
same_fp = _fingerprint(p.body) == _fingerprint(q.body)
sim = jaccard(shingles(p.body), shingles(q.body))
if same_fp or sim >= threshold:
parent[find(p.identity)] = find(q.identity)
comps = {}
for p in pages:
comps.setdefault(find(p.identity), set()).add(p.identity)
return {frozenset(v) for v in comps.values() if len(v) > 1}


def test_minhash_lsh_buckets_near_duplicates_together():
    hasher = MinHasher(num_perm=64)
    base = "the quick brown fox jumps over the lazy dog near the river bank today"
    near = base + " and then some"
    far = "completely unrelated content about astrophysics and distant galaxies far"
    b_base = set(band_keys(hasher.signature(shingles(base)), 32))
    b_near = set(band_keys(hasher.signature(shingles(near)), 32))
    b_far = set(band_keys(hasher.signature(shingles(far)), 32))
    assert b_base & b_near  # near-duplicates share at least one band
    assert not (b_base & b_far)  # unrelated pages do not


def test_exact_duplicate_across_shards_is_equivalent():
    idx = EquivalenceIndex()
    idx.add(_page("A", "Foo", "identical body text here"))
    idx.add(_page("B", "Bar", "identical body text here"))
    assert idx.equivalent_to(Identity("A", "Foo")) == frozenset(
        {Identity("A", "Foo"), Identity("B", "Bar")}
    )


def test_unrelated_pages_are_not_equivalent():
    idx = EquivalenceIndex()
    idx.add(_page("A", "Foo", "alpha beta gamma delta epsilon"))
    idx.add(_page("B", "Bar", "nothing in common whatsoever entirely"))
    assert idx.groups() == ()


def test_curator_binding_forces_equivalence_regardless_of_content():
    idx = EquivalenceIndex()
    idx.add(_page("A", "Foo", "one thing"))
    idx.add(_page("B", "Bar", "totally different"))
    idx.bind(Identity("A", "Foo"), Identity("B", "Bar"))
    assert idx.equivalent_to(Identity("A", "Foo")) == frozenset(
        {Identity("A", "Foo"), Identity("B", "Bar")}
    )


def test_index_matches_brute_force_oracle():
    threshold = 0.7
    shared = "shared sentence one shared sentence two shared sentence three end"
    pages = [
        _page("A", "Doc1", shared),
        _page("B", "Doc1copy", shared + " minor tail"),  # near-dup of A
        _page("C", "Other", "a totally distinct page with no overlapping shingles at all here"),
        _page("D", "Lonely", "yet another isolated document about unrelated subject matter alone"),
    ]
    idx = EquivalenceIndex(threshold=threshold)
    idx.build(pages)
    assert set(idx.groups()) == _brute_force_groups(pages, threshold)
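
Reviewer note: the oracle these tests compare against can be pictured as the standalone sketch below. It is only an illustration under assumptions: `shingles`, `jaccard`, and `brute_force_groups` here are hypothetical stand-ins written for this note (word 3-grams, plain string ids), not the helpers imported by the test module, whose shingle size and fingerprinting are not shown in this diff.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of text (hypothetical helper)."""
    words = text.split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def brute_force_groups(docs, threshold):
    """Compare every pair (O(n^2)) and union those at or above the threshold."""
    parent = {name: name for name in docs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if jaccard(shingles(docs[a]), shingles(docs[b])) >= threshold:
            parent[find(a)] = find(b)

    comps = {}
    for name in docs:
        comps.setdefault(find(name), set()).add(name)
    # Singletons are not reported as groups.
    return {frozenset(v) for v in comps.values() if len(v) > 1}
```

Because grouping goes through union-find, equivalence is transitive: a chain of pairwise matches ends up in one group even when its endpoints are not directly similar.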


@@ -0,0 +1,84 @@
"""Incremental maintenance == rebuild, with retraction + propagation (SHARD-WP-0011 T2)."""
from shard_wiki.incremental import EquivalenceIndex
from shard_wiki.model import Identity, Page
from shard_wiki.provenance import ProvenanceEnvelope


def _page(shard, key, body):
    return Page(
        identity=Identity(shard, key),
        body=body,
        envelope=ProvenanceEnvelope(source_shard=shard),
    )


def _rebuilt(pages, curator=()):
    idx = EquivalenceIndex()
    idx.build(pages, curator)
    return idx


def _equal(a, b):
    return a.edges() == b.edges() and set(a.groups()) == set(b.groups())


def test_add_keeps_index_equal_to_rebuild():
    pages = [_page("A", "Foo", "same content here"), _page("B", "Bar", "same content here")]
    idx = EquivalenceIndex()
    for p in pages:
        idx.add(p)
    assert _equal(idx, _rebuilt(pages))
    assert idx.groups()  # the two collapse


def test_remove_keeps_index_equal_to_rebuild():
    pages = [
        _page("A", "Foo", "same content here"),
        _page("B", "Bar", "same content here"),
        _page("C", "Baz", "unrelated isolated material entirely"),
    ]
    idx = _rebuilt(pages)
    idx.remove(Identity("B", "Bar"))
    assert _equal(idx, _rebuilt([pages[0], pages[2]]))


def test_edit_into_new_bucket_retracts_stale_edge():
    a = _page("A", "Foo", "shared identical body text")
    b = _page("B", "Bar", "shared identical body text")
    idx = _rebuilt([a, b])
    assert idx.groups()  # A ≡ B initially
    # Edit B to something completely different: it exits A's buckets, the edge is retracted.
    b2 = _page("B", "Bar", "now totally divergent unrelated prose about nothing")
    idx.update(b2)
    assert idx.groups() == ()  # stale edge gone
    assert _equal(idx, _rebuilt([a, b2]))


def test_edit_into_equivalence_adds_edge():
    a = _page("A", "Foo", "target body to converge on later")
    b = _page("B", "Bar", "initially completely separate writing here")
    idx = _rebuilt([a, b])
    assert idx.groups() == ()
    b2 = _page("B", "Bar", "target body to converge on later")  # now identical to A
    idx.update(b2)
    assert idx.equivalent_to(Identity("A", "Foo")) == frozenset(
        {Identity("A", "Foo"), Identity("B", "Bar")}
    )
    assert _equal(idx, _rebuilt([a, b2]))


def test_removing_connector_splits_a_chorus():
    # Curator chain A—B—C (no direct A—C): one group of three.
    a, b, c = (_page("A", "X", "aaa"), _page("B", "Y", "bbb"), _page("C", "Z", "ccc"))
    idx = EquivalenceIndex()
    for p in (a, b, c):
        idx.add(p)
    idx.bind(a.identity, b.identity)
    idx.bind(b.identity, c.identity)
    assert idx.equivalent_to(a.identity) == {a.identity, b.identity, c.identity}
    # Removing the connector B retracts/propagates: the chorus splits.
    idx.remove(b.identity)
    assert idx.groups() == ()
    chain = [(a.identity, b.identity), (b.identity, c.identity)]
    assert _equal(idx, _rebuilt([a, c], curator=chain))


@@ -0,0 +1,89 @@
"""Tests for I-2 verification — digest + consistency-checker (SHARD-WP-0011 T3)."""
from shard_wiki.incremental import (
    ConsistencyChecker,
    EquivalenceIndex,
    derived_digest,
)
from shard_wiki.model import Identity, Page
from shard_wiki.provenance import ProvenanceEnvelope


def _page(shard, key, body):
    return Page(
        identity=Identity(shard, key),
        body=body,
        envelope=ProvenanceEnvelope(source_shard=shard),
    )


def test_digest_is_stable_under_equivalent_event_orders():
    pages = [
        _page("A", "Foo", "shared body text here"),
        _page("B", "Bar", "shared body text here"),
        _page("C", "Baz", "an entirely separate unrelated document"),
    ]
    forward = EquivalenceIndex()
    for p in pages:
        forward.add(p)
    reverse = EquivalenceIndex()
    for p in reversed(pages):
        reverse.add(p)
    assert derived_digest(forward) == derived_digest(reverse)


def test_clean_index_reports_healthy():
    pages = [_page("A", "Foo", "same body"), _page("B", "Bar", "same body")]
    idx = EquivalenceIndex()
    idx.build(pages)
    checker = ConsistencyChecker(idx, pages_fn := (lambda: pages))
    report = checker.check_and_repair()
    assert report.drifted is False and report.healthy is True
    assert pages_fn()  # source unchanged


def test_missed_delta_drift_is_detected_and_repaired():
    a = _page("A", "Foo", "converging target body")
    b = _page("B", "Bar", "initially unrelated separate text")
    source = {"pages": [a, b]}
    idx = EquivalenceIndex()
    idx.build(source["pages"])
    assert idx.groups() == ()  # not equivalent yet
    # Source changes B to match A, but the index is never told (a missed delta → drift).
    b2 = _page("B", "Bar", "converging target body")
    source["pages"] = [a, b2]
    checker = ConsistencyChecker(idx, lambda: source["pages"])
    report = checker.check_and_repair()
    assert report.drifted is True and report.repaired is True and report.healthy is True
    # Self-healed: the index now reflects the equivalence.
    assert idx.equivalent_to(Identity("A", "Foo")) == frozenset(
        {Identity("A", "Foo"), Identity("B", "Bar")}
    )


def test_corrupted_internal_state_is_healed():
    a = _page("A", "Foo", "identical content")
    b = _page("B", "Bar", "identical content")
    idx = EquivalenceIndex()
    idx.build([a, b])
    # Corrupt the derived tier directly: delete a true edge (simulated index corruption).
    idx._content_edges.clear()
    assert idx.groups() == ()  # corrupted away
    checker = ConsistencyChecker(idx, lambda: [a, b])
    report = checker.check_and_repair()
    assert report.drifted is True and report.healthy is True
    assert idx.groups()  # edge restored by scoped recompute


def test_removed_source_page_is_reconciled():
    a = _page("A", "Foo", "same body")
    b = _page("B", "Bar", "same body")
    idx = EquivalenceIndex()
    idx.build([a, b])
    checker = ConsistencyChecker(idx, lambda: [a])  # B vanished from source
    report = checker.check_and_repair()
    assert report.healthy is True
    assert Identity("B", "Bar") not in idx.identities()
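
Reviewer note: the order-independence asserted in `test_digest_is_stable_under_equivalent_event_orders` is the property one gets from hashing a canonical form of the derived state. The sketch below shows that idea in isolation. `edges_digest` is a hypothetical function written for this note; it is not `derived_digest`, whose actual inputs and hash are not visible in this diff.

```python
import hashlib


def edges_digest(edges):
    """Hash an undirected edge set so insertion and endpoint order do not matter."""
    # Canonical form: sort the endpoints inside each edge, then sort the edges.
    canonical = sorted(tuple(sorted(edge)) for edge in edges)
    h = hashlib.sha256()
    for a, b in canonical:
        # NUL and newline separators keep ("ab", "c") distinct from ("a", "bc").
        h.update(f"{a}\x00{b}\n".encode("utf-8"))
    return h.hexdigest()
```

Two indexes that reached the same edges through different event orders then compare equal by digest, while a missing or extra edge changes the digest.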


@@ -0,0 +1,74 @@
"""Wire the incremental tier behind InformationSpace views (SHARD-WP-0011 T4)."""
from shard_wiki.adapters import FolderAdapter
from shard_wiki.coordination import EventType
from shard_wiki.model import Identity
from shard_wiki.space import InformationSpace
from shard_wiki.views import all_pages


def _shard(tmp_path, name, files):
    root = tmp_path / name
    for rel, text in files.items():
        p = root / rel
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(text, encoding="utf-8")
    return FolderAdapter(name, root)


def test_all_pages_via_index_matches_direct_fold(tmp_path):
    space = InformationSpace("space")
    space.attach(_shard(tmp_path, "wiki", {"Home.md": "welcome", "Guide.md": "the guide"}))
    space.attach(_shard(tmp_path, "notes", {"Daily.md": "today"}))
    # Routed-through-index result equals the direct fold-based computation (behaviour unchanged).
    via_index = {(e.name, e.members) for e in space.all_pages()}
    direct = {(e.name, e.members) for e in all_pages(space.union)}
    assert via_index == direct


def test_curator_binding_collapses_via_maintained_index(tmp_path):
    space = InformationSpace("space")
    space.attach(_shard(tmp_path, "a", {"Foo.md": "x"}))
    space.attach(_shard(tmp_path, "b", {"Bar.md": "y"}))
    space.log.append(
        "space", EventType.BINDING_MADE, {"members": ["a:Foo", "b:Bar"]}
    )
    # The maintained index re-syncs curator edges live from the log fold.
    collapsed = [e for e in space.all_pages() if len(e.members) == 2]
    assert len(collapsed) == 1
    assert set(collapsed[0].members) == {Identity("a", "Foo"), Identity("b", "Bar")}


def test_content_duplicate_collapses_via_index(tmp_path):
    space = InformationSpace("space")
    space.attach(_shard(tmp_path, "a", {"Foo.md": "the very same body content here"}))
    space.attach(_shard(tmp_path, "b", {"Bar.md": "the very same body content here"}))
    dup = [e for e in space.all_pages() if len(e.members) == 2]
    assert len(dup) == 1  # content equivalence detected by the maintained index
    assert set(dup[0].members) == {Identity("a", "Foo"), Identity("b", "Bar")}


def test_attach_invalidates_index(tmp_path):
    space = InformationSpace("space")
    space.attach(_shard(tmp_path, "a", {"Foo.md": "same body"}))
    assert space.all_pages()  # builds the index (one page, no groups)
    space.attach(_shard(tmp_path, "b", {"Bar.md": "same body"}))  # marks index stale
    dup = [e for e in space.all_pages() if len(e.members) == 2]
    assert len(dup) == 1  # rebuilt fallback picks up the new equivalent page


def test_verify_index_reports_healthy_when_consistent(tmp_path):
    space = InformationSpace("space")
    space.attach(_shard(tmp_path, "a", {"Foo.md": "same body"}))
    space.attach(_shard(tmp_path, "b", {"Bar.md": "same body"}))
    space.all_pages()  # ensure built
    report = space.verify_index()
    assert report.healthy is True


def test_reindex_is_an_explicit_fallback(tmp_path):
    space = InformationSpace("space")
    space.attach(_shard(tmp_path, "a", {"Foo.md": "content"}))
    before = space.index.digest()
    space.reindex()
    assert space.index.digest() == before  # rebuild is deterministic


@@ -0,0 +1,76 @@
"""Tests for the AllPages + SiteMap enumeration views (SHARD-WP-0010 T4)."""
from shard_wiki.adapters import FolderAdapter
from shard_wiki.coordination import DecisionLog, EventType
from shard_wiki.model import Identity
from shard_wiki.union import UnionGraph
from shard_wiki.views import all_pages, site_map


def _shard(tmp_path, name, files):
    root = tmp_path / name
    for rel, text in files.items():
        p = root / rel
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(text, encoding="utf-8")
    return FolderAdapter(name, root)


def test_all_pages_spans_shards(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"A.md": "a"}))
    u.attach(_shard(tmp_path, "shardB", {"B.md": "b"}))
    names = {e.name for e in all_pages(u)}
    assert names == {"A", "B"}


def test_chorus_collapses_to_one_entry_with_divergence(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Home.md": "A home"}))
    u.attach(_shard(tmp_path, "shardB", {"Home.md": "B home"}))
    entries = all_pages(u)
    home = [e for e in entries if e.name == "Home"]
    assert len(home) == 1  # chorus → single entry
    assert set(home[0].members) == {Identity("shardA", "Home"), Identity("shardB", "Home")}
    assert home[0].diverges is True  # bodies differ — collapse acknowledged, not silent


def test_chorus_same_body_does_not_diverge(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Home.md": "same"}))
    u.attach(_shard(tmp_path, "shardB", {"Home.md": "same"}))
    (home,) = [e for e in all_pages(u) if e.name == "Home"]
    assert home.diverges is False


def test_equivalence_binding_collapses_distinct_keys(tmp_path):
    log = DecisionLog()
    log.append(
        "space", EventType.BINDING_MADE, {"members": ["shardA:Foo", "shardB:Bar"]}
    )
    u = UnionGraph("space", log=log)
    u.attach(_shard(tmp_path, "shardA", {"Foo.md": "x"}))
    u.attach(_shard(tmp_path, "shardB", {"Bar.md": "x"}))
    pair = {Identity("shardA", "Foo"), Identity("shardB", "Bar")}
    # The two bound identities fold into one entry (named by the min key, "Bar").
    bound = [e for e in all_pages(u) if {*e.members} == pair]
    assert len(bound) == 1
    assert bound[0].name == "Bar"


def test_sitemap_reflects_namespace_paths(tmp_path):
    u = UnionGraph("space")
    u.attach(
        _shard(
            tmp_path,
            "shardA",
            {"Home.md": "h", "docs/Guide.md": "g", "docs/api/Ref.md": "r"},
        )
    )
    root = site_map(u)
    # Top level: "Home" page directly, and a "docs" namespace.
    assert any(p.key == "Home" for p in root.pages)
    docs = next(c for c in root.children if c.name == "docs")
    assert any(p.key == "docs/Guide" for p in docs.pages)
    api = next(c for c in docs.children if c.name == "api")
    assert any(p.key == "docs/api/Ref" for p in api.pages)


@@ -0,0 +1,51 @@
"""Tests for the BackLinks derived view (SHARD-WP-0010 T2)."""
from shard_wiki.adapters import FolderAdapter
from shard_wiki.model import Identity
from shard_wiki.union import UnionGraph
from shard_wiki.views import build_backlinks


def _shard(tmp_path, name, files):
    root = tmp_path / name
    for rel, text in files.items():
        p = root / rel
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(text, encoding="utf-8")
    return FolderAdapter(name, root)


def test_link_yields_backlink_with_provenance(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"A.md": "see [[B]]", "B.md": "target"}))
    index = build_backlinks(u)
    assert index.sources("B") == frozenset({Identity("shardA", "A")})
    (bl,) = index.to("B")
    assert bl.source_shard == "shardA"  # entry carries source provenance


def test_red_links_create_no_backlinks(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"A.md": "see [[Ghost]]"}))
    index = build_backlinks(u)
    assert index.to("Ghost") == ()  # unresolved target → no backlink
    assert "Ghost" not in index.names()


def test_chorus_target_aggregates_backlinks(tmp_path):
    # "Home" exists in two shards (a chorus); links to it from anywhere aggregate under one name.
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Home.md": "A home", "A.md": "[[Home]]"}))
    u.attach(_shard(tmp_path, "shardB", {"Home.md": "B home", "B.md": "[[Home]]"}))
    index = build_backlinks(u)
    assert index.sources("Home") == frozenset(
        {Identity("shardA", "A"), Identity("shardB", "B")}
    )


def test_backlinks_span_shards(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Index.md": "x"}))
    u.attach(_shard(tmp_path, "shardB", {"B.md": "links [[Index]]"}))
    index = build_backlinks(u)
    assert index.sources("Index") == frozenset({Identity("shardB", "B")})
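
Reviewer note: the behaviour these tests pin down (links invert into backlinks, red links contribute nothing, same-named pages across shards aggregate under one name) can be sketched without the library. Everything below is hypothetical illustration: `build_backlinks_sketch` takes a plain nested dict instead of a `UnionGraph` and returns a plain dict, not the index type returned by `build_backlinks`.

```python
import re
from collections import defaultdict

# Capture the target of [[Target]] or [[Target|label]].
_WIKILINK = re.compile(r"\[\[([^\[\]|]+)")


def build_backlinks_sketch(shards):
    """Map each existing page name to the (shard, key) pages that link to it.

    shards: {shard_name: {page_key: body}}
    """
    # A name resolves if any shard holds a page with that key.
    existing = {key for pages in shards.values() for key in pages}
    index = defaultdict(set)
    for shard, pages in shards.items():
        for key, body in pages.items():
            for match in _WIKILINK.finditer(body):
                target = match.group(1).strip()
                if target in existing:  # red links create no backlink
                    index[target].add((shard, key))
    return {target: frozenset(sources) for target, sources in index.items()}
```

Keying the index by page name rather than by `(shard, key)` is what makes a chorus target collect backlinks from every shard in one entry.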


@@ -0,0 +1,52 @@
"""Integration: derived views exposed on InformationSpace over two shards (SHARD-WP-0010 T5)."""
from shard_wiki.adapters import FolderAdapter
from shard_wiki.model import Identity
from shard_wiki.space import InformationSpace


def _shard(tmp_path, name, files):
    root = tmp_path / name
    for rel, text in files.items():
        p = root / rel
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(text, encoding="utf-8")
    return FolderAdapter(name, root)


def _space(tmp_path):
    space = InformationSpace("space")
    space.attach(
        _shard(tmp_path, "wiki", {"Home.md": "welcome, see [[Guide]]", "Guide.md": "the guide"})
    )
    space.attach(_shard(tmp_path, "notes", {"Daily.md": "today I read [[Guide]]"}))
    return space


def test_backlinks_across_two_shards(tmp_path):
    space = _space(tmp_path)
    sources = {bl.source for bl in space.backlinks("Guide")}
    assert sources == {Identity("wiki", "Home"), Identity("notes", "Daily")}


def test_all_pages_and_site_map_over_union(tmp_path):
    space = _space(tmp_path)
    names = {e.name for e in space.all_pages()}
    assert names == {"Home", "Guide", "Daily"}
    leaves = {p.key for p in space.site_map().pages}
    assert {"Home", "Guide", "Daily"} <= leaves


def test_recent_changes_includes_alias_and_edits(tmp_path):
    space = _space(tmp_path)
    space.alias("Start", "wiki:Home", actor="ana")
    feed = space.recent_changes()
    kinds = {e.kind for e in feed}
    assert "alias" in kinds and "edit" in kinds
    alias = next(e for e in feed if e.kind == "alias")
    assert alias.source == "coordination" and alias.actor == "ana"


def test_red_link_creates_no_backlink_via_space(tmp_path):
    space = _space(tmp_path)
    assert space.backlinks("Nonexistent") == ()

tests/test_views_links.py

@@ -0,0 +1,69 @@
"""Tests for the wikilink + red-link model (SHARD-WP-0010 T1)."""
from shard_wiki.adapters import FolderAdapter
from shard_wiki.union import ResolutionKind, UnionGraph
from shard_wiki.views import extract_links, resolve_links


def _shard(tmp_path, name, files):
    root = tmp_path / name
    for rel, text in files.items():
        p = root / rel
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(text, encoding="utf-8")
    return FolderAdapter(name, root)


def test_extracts_plain_and_labelled_links():
    links = extract_links("See [[Home]] and [[Index|the index]].")
    assert [(link.target, link.label, link.text) for link in links] == [
        ("Home", None, "Home"),
        ("Index", "the index", "the index"),
    ]


def test_links_carry_body_offsets_in_document_order():
    body = "a [[One]] b [[Two]]"
    links = extract_links(body)
    assert [link.target for link in links] == ["One", "Two"]
    s, e = links[0].span
    assert body[s:e] == "[[One]]"


def test_code_regions_are_not_scanned():
    body = "real [[Home]]\n```\n[[NotALink]]\n```\ninline `[[AlsoNot]]` done"
    targets = [link.target for link in extract_links(body)]
    assert targets == ["Home"]


def test_camelcase_off_by_default_then_opt_in():
    body = "FrontPage links to [[Home]]"
    assert [link.target for link in extract_links(body)] == ["Home"]  # CamelCase ignored
    on = extract_links(body, camelcase=True)
    assert {link.target for link in on} == {"FrontPage", "Home"}
    assert next(link for link in on if link.target == "FrontPage").auto is True


def test_camelcase_does_not_double_count_inside_explicit_link():
    # [[FrontPage]] is one explicit link, not also a CamelCase auto-link.
    links = extract_links("[[FrontPage]]", camelcase=True)
    assert len(links) == 1
    assert links[0].auto is False


def test_resolve_links_distinguishes_link_from_red_link(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Home.md": "home"}))
    resolved = resolve_links(u, "[[Home]] and [[Ghost]]")
    by_target = {r.link.target: r for r in resolved}
    assert by_target["Home"].resolution.kind is ResolutionKind.SINGLE
    assert by_target["Home"].is_red_link is False
    assert by_target["Ghost"].is_red_link is True  # unresolved → createable red-link


def test_resolve_links_surfaces_chorus(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Home.md": "A"}))
    u.attach(_shard(tmp_path, "shardB", {"Home.md": "B"}))
    (resolved,) = resolve_links(u, "[[Home]]")
    assert resolved.resolution.kind is ResolutionKind.CHORUS
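
Reviewer note: one way to get both "code regions are not scanned" and "links carry body offsets" is to blank out code with same-length padding before matching, so spans still index into the original body. The sketch below assumes that approach. `extract_links_sketch` and its tuple return shape are hypothetical; the real `extract_links` in `shard_wiki.views` may be implemented differently, and CamelCase auto-linking is left out.

```python
import re

# Fenced blocks first, then inline code spans.
_FENCE = re.compile(r"```.*?```", re.DOTALL)
_INLINE = re.compile(r"`[^`\n]*`")
# [[Target]] or [[Target|label]]
_WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def _blank(match):
    # Same-length padding keeps every later offset valid for the original body.
    return " " * len(match.group(0))


def extract_links_sketch(body):
    """Return (target, label, (start, end)) tuples in document order."""
    masked = _INLINE.sub(_blank, _FENCE.sub(_blank, body))
    links = []
    for match in _WIKILINK.finditer(masked):
        target = match.group(1).strip()
        label = match.group(2).strip() if match.group(2) else None
        links.append((target, label, match.span()))
    return links
```

Masking instead of deleting is the design choice that matters here: deleting code regions would shift every later span.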


@@ -0,0 +1,67 @@
"""Tests for the RecentChanges merged feed (SHARD-WP-0010 T3)."""
import os
from datetime import datetime, timezone

from shard_wiki.adapters import FolderAdapter
from shard_wiki.coordination import DecisionLog, EventType
from shard_wiki.union import UnionGraph
from shard_wiki.views import recent_changes


def _shard(tmp_path, name, files, mtime=None):
    root = tmp_path / name
    for rel, text in files.items():
        p = root / rel
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(text, encoding="utf-8")
        if mtime is not None:
            os.utime(p, (mtime, mtime))
    return FolderAdapter(name, root)


def test_edit_and_alias_both_appear_newest_first(tmp_path):
    # Page edit signal pinned to an old mtime; the alias decision happens "now" → alias is newest.
    old = datetime(2020, 1, 1, tzinfo=timezone.utc).timestamp()
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Home.md": "home"}, mtime=old))
    log = DecisionLog()
    log.append("space", EventType.ALIAS_SET, {"alias": "Start", "target": "shardA:Home"})
    feed = recent_changes(u, log, "space")
    kinds = [e.kind for e in feed]
    assert "edit" in kinds and "alias" in kinds
    assert feed[0].kind == "alias"  # newest first
    assert feed[-1].kind == "edit"
    # Monotonic non-increasing by time.
    assert all(feed[i].when >= feed[i + 1].when for i in range(len(feed) - 1))


def test_per_shard_attribution_present(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"A.md": "a"}))
    u.attach(_shard(tmp_path, "shardB", {"B.md": "b"}))
    feed = recent_changes(u, DecisionLog(), "space")
    edits = {e.ref: e.source for e in feed if e.kind == "edit"}
    assert edits["shardA:A"] == "shardA"
    assert edits["shardB:B"] == "shardB"  # each edit attributed to its shard


def test_coordination_entries_carry_actor_and_ref(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"Doc.md": "x"}))
    log = DecisionLog()
    log.append(
        "space", EventType.PAGE_FORKED, {"source": "shardA:Doc", "fork": "shardB:Doc"}, actor="ana"
    )
    fork = next(e for e in recent_changes(u, log, "space") if e.kind == "fork")
    assert fork.source == "coordination"
    assert fork.actor == "ana"
    assert fork.ref == "shardA:Doc→shardB:Doc"


def test_limit_truncates_to_newest(tmp_path):
    u = UnionGraph("space")
    u.attach(_shard(tmp_path, "shardA", {"A.md": "a", "B.md": "b", "C.md": "c"}))
    feed = recent_changes(u, DecisionLog(), "space", limit=2)
    assert len(feed) == 2
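
Reviewer note: the feed contract checked above (two sources merged, newest first, `limit` keeps the newest entries) reduces to a sort and a slice. The sketch below is a hypothetical illustration only: `FeedEntry` and `merge_feed` are invented for this note and carry just the fields the tests read, not the entry type that `recent_changes` returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedEntry:
    when: float   # POSIX timestamp of the change
    kind: str     # e.g. "edit", "alias", "fork"
    source: str   # shard name, or "coordination" for log decisions
    ref: str      # what the entry is about


def merge_feed(edits, decisions, limit=None):
    """Merge shard edits and log decisions into one newest-first feed."""
    feed = sorted([*edits, *decisions], key=lambda e: e.when, reverse=True)
    # Truncating after the sort keeps the newest `limit` entries.
    return feed if limit is None else feed[:limit]
```

Sorting before truncating matters: slicing each source first could drop an entry that is newer than everything in the other source.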


@@ -2,7 +2,7 @@
id: SHARD-WP-0001
type: workplan
title: "shard-wiki requirements from yawex prior art"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0002
type: workplan
title: "federation architecture design"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0003
type: workplan
title: "wiki-engine deep-dive batch (new-insight + git-forge + classic engines)"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0004
type: workplan
title: "computational / interactive-knowledge systems research"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0005
type: workplan
title: "core architecture hardening (blueprint review fixes)"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0006
type: workplan
title: "core architecture hardening II (round-2 review fixes)"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0007
type: workplan
title: "foundation implementation — model, contract, decision log, union read"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0008
type: workplan
title: "write path — overlay engine, writable adapter, apply-under-drift"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0009
type: workplan
title: "git-backed DecisionLog + per-space append authority"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,9 +2,9 @@
id: SHARD-WP-0010
type: workplan
title: "derived views — wikilinks, BackLinks, RecentChanges, AllPages/SiteMap"
-domain: whynot
+domain: consumer
repo: shard-wiki
-status: active
+status: done
owner: tegwick
topic_slug: whynot
created: "2026-06-15"
@@ -36,7 +36,7 @@ later by SHARD-WP-0011) and carry provenance. Presentation stays out of core (L6
```task
id: SHARD-WP-0010-T1
-status: todo
+status: done
priority: high
state_hub_task_id: "792660c3-9be9-4771-9f51-69d01f0c7f13"
```
@@ -51,7 +51,7 @@ red-link, CamelCase opt-in.
```task
id: SHARD-WP-0010-T2
-status: todo
+status: done
priority: high
state_hub_task_id: "431a54c3-82b5-4b08-b3f0-762624d4c91d"
```
@@ -65,7 +65,7 @@ chorus pages aggregate.
```task
id: SHARD-WP-0010-T3
-status: todo
+status: done
priority: medium
state_hub_task_id: "270c1c31-0445-42b9-9a49-92d32c298eb2"
```
@@ -79,7 +79,7 @@ alias both appear, newest-first; per-shard attribution present.
```task
id: SHARD-WP-0010-T4
-status: todo
+status: done
priority: low
state_hub_task_id: "898ba43e-cdef-4ce8-9fa3-4ce60ebb4fdd"
```
@@ -92,7 +92,7 @@ collapses to one entry with divergence noted; sitemap reflects paths.
```task
id: SHARD-WP-0010-T5
-status: todo
+status: done
priority: medium
state_hub_task_id: "7157544b-5d3b-45a2-ba5a-c32244c59323"
```


@@ -2,9 +2,9 @@
id: SHARD-WP-0011
type: workplan
title: "incremental union maintenance + equivalence index + I-2 verification"
-domain: whynot
+domain: consumer
repo: shard-wiki
-status: active
+status: done
owner: tegwick
topic_slug: whynot
created: "2026-06-15"
@@ -41,7 +41,7 @@ deployment is later.
```task
id: SHARD-WP-0011-T1
-status: todo
+status: done
priority: high
state_hub_task_id: "842f480b-7b14-47cd-818b-012dbda9c187"
```
@@ -55,7 +55,7 @@ unrelated pages don't; verified edges match a brute-force oracle on a small corp
```task
id: SHARD-WP-0011-T2
-status: todo
+status: done
priority: high
state_hub_task_id: "2da4e0b8-22cc-4ad1-a9aa-b5e991515d30"
```
@@ -70,7 +70,7 @@ stale edge.
```task
id: SHARD-WP-0011-T3
-status: todo
+status: done
priority: high
state_hub_task_id: "b602ce31-ad9a-4c7f-b596-f039722373fc"
```
@@ -85,7 +85,7 @@ equivalent event orders.
```task
id: SHARD-WP-0011-T4
-status: todo
+status: done
priority: medium
state_hub_task_id: "2f3d083c-0b2e-4b58-9e96-c0461c5eb089"
```


@@ -2,9 +2,9 @@
id: SHARD-WP-0012
type: workplan
title: "second adapter — git-IS-store shard (contract validation on a new substrate)"
-domain: whynot
+domain: consumer
repo: shard-wiki
-status: active
+status: done
owner: tegwick
topic_slug: whynot
created: "2026-06-15"
@@ -40,7 +40,7 @@ merge beyond fast-forward (apply-under-drift refuse is enough, as in SHARD-WP-00
```task
id: SHARD-WP-0012-T1
-status: todo
+status: done
priority: high
state_hub_task_id: "8a1c7c80-a0cc-4e02-a611-1f1fd7dec57b"
```
@@ -54,7 +54,7 @@ implication rules. Tests: read tracked files; profile validates; conformance rea
```task
id: SHARD-WP-0012-T2
-status: todo
+status: done
priority: high
state_hub_task_id: "b47dfb86-46c1-4e97-a62f-377719499ff2"
```
@@ -68,7 +68,7 @@ changes after an external commit.
```task
id: SHARD-WP-0012-T3
-status: todo
+status: done
priority: medium
state_hub_task_id: "4c895f42-671d-4948-8bdf-941fd85644bb"
```


@@ -2,7 +2,7 @@
id: SHARD-WP-0013
type: workplan
title: "wiki-engine prep — reuse-surface registration, UC-catalog systematization, WikiEngineCoreArchitecture"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick


@@ -2,7 +2,7 @@
id: SHARD-WP-0014
type: workplan
title: "wiki-engine implementation — kernel + typed-extension runtime + activation"
-domain: whynot
+domain: consumer
repo: shard-wiki
status: done
owner: tegwick