From 0eea94d05ec5dfc9b38eb17be031ae93b0dc1bad Mon Sep 17 00:00:00 2001
From: tegwick <bernd.worsch@gmail.com>
Date: Mon, 18 May 2026 23:56:41 +0200
Subject: [PATCH] Implement refinement hardening workplan

---
 README.md                                     |   6 +-
 docs/maturity-scorecard.md                    |  91 +++++-----
 docs/operational-readiness.md                 | 136 ++++++++++++++
 src/phase_memory/__init__.py                  |   9 +-
 src/phase_memory/adapters.py                  | 158 +++++++++++++++-
 src/phase_memory/external_adapters.py         | 131 +++++++++++++-
 src/phase_memory/policy.py                    |   1 +
 src/phase_memory/ports.py                     |   1 +
 src/phase_memory/runtime.py                   |  86 +++++++++
 src/phase_memory/service.py                   |  21 +++
 tests/fixtures/evaluation-scenarios.json      | 170 ++++++++++++++++++
 tests/test_evaluation_scenarios.py            | 101 +++++++++++
 tests/test_external_adapter_packs.py          |  38 +++-
 tests/test_file_backed_runtime.py             |  38 ++++
 tests/test_service_readiness.py               |  55 +++++-
 ...ent-hardening-and-operational-readiness.md |  38 +++-
 ...e-adapter-and-service-binding-readiness.md | 152 ++++++++++++++++
 17 files changed, 1164 insertions(+), 68 deletions(-)
 create mode 100644 docs/operational-readiness.md
 create mode 100644 tests/fixtures/evaluation-scenarios.json
 create mode 100644 tests/test_evaluation_scenarios.py
 create mode 100644 workplans/PMEM-WP-0012-live-adapter-and-service-binding-readiness.md

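Note on the atomic JSON writes in `src/phase_memory/adapters.py`: `_write_json`
now writes to a fixed `.{name}.tmp` sibling and renames it over the target.
The block below is a minimal sketch of the same temp-file-then-rename
technique, not the code in this patch. It assumes a hypothetical helper name
(`write_json_atomic`) and swaps in a unique temp file plus an `fsync`, which
the patch does not do, to illustrate how concurrent writers and partial
flushes could be handled if that ever becomes necessary.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def write_json_atomic(path: Path, data: dict[str, Any]) -> None:
    """Hypothetical helper: write JSON to a temp file, then rename it over `path`."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # A unique temp file in the target directory keeps the rename on one
    # filesystem and stops two writers from sharing a temp name.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=f".{path.name}.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(json.dumps(data, indent=2, sort_keys=True) + "\n")
            handle.flush()
            os.fsync(handle.fileno())  # flush file contents to disk before renaming
        os.replace(tmp_name, path)  # swaps the new file in; readers see old or new content
    except BaseException:
        # On any failure the previous file stays untouched; drop the partial temp file.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

If serialization fails, the existing file is left as it was and no temp file
remains in the store directory, which is the property the repair diagnostics
rely on when they report `corrupt_store_record`.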
diff --git a/README.md b/README.md
index fe260fd..0164a5a 100644
--- a/README.md
+++ b/README.md
@@ -95,6 +95,6 @@ for package bridge boundaries, [docs/activation-quality.md](docs/activation-qual
 for retrieval and evaluation behavior, [docs/service-readiness.md](docs/service-readiness.md)
 for service and adapter contracts, [docs/lifecycle-rules.md](docs/lifecycle-rules.md)
 for profile-driven lifecycle rules, [docs/external-adapter-packs.md](docs/external-adapter-packs.md)
-for fake external integration packs, [docs/maturity-scorecard.md](docs/maturity-scorecard.md)
-for the current maturity assessment, and [SCOPE.md](SCOPE.md) for repository
-boundaries.
+for fake external integration packs, [docs/operational-readiness.md](docs/operational-readiness.md)
+for the local end-to-end operational recipe, [docs/maturity-scorecard.md](docs/maturity-scorecard.md)
+for the current maturity assessment, and [SCOPE.md](SCOPE.md) for repository boundaries.
diff --git a/docs/maturity-scorecard.md b/docs/maturity-scorecard.md
index 223a5bf..19b9850 100644
--- a/docs/maturity-scorecard.md
+++ b/docs/maturity-scorecard.md
@@ -26,74 +26,83 @@ to 5.
 
 ## Current Score
 
-Overall maturity: **3.8 / 5**
+Overall maturity: **4.0 / 5**
 
 Two sub-scores make the result easier to reason about:
 
-- Local integration maturity: **4.1 / 5**
-- Operational maturity: **3.2 / 5**
+- Local integration maturity: **4.3 / 5**
+- Operational maturity: **3.5 / 5**
 
 The repo is strong as a deterministic local library and service-boundary core.
 It is not yet production-operational because the external adapters are fakes,
-durability semantics are basic, service bindings are framework-neutral shapes
-rather than deployable endpoints, and evaluation coverage is still narrow.
+service bindings are framework-neutral shapes rather than deployable endpoints,
+and migration behavior is diagnostic rather than an operator-applied migration
+system.
 
 ## Dimension Scorecard
 
 | Dimension | Score | Target | Evidence | Needed Next |
 | --- | ---: | ---: | --- | --- |
 | Intent and boundaries | 4.4 | 5.0 | `INTENT.md`, `SCOPE.md`, `README.md`, architecture docs, adjacent-repo boundary docs | Keep docs current as live adapters and service bindings clarify real ownership. |
-| Package and API foundation | 4.2 | 4.5 | Python package, public exports, runtime facade, CLI, service config, dependency-light tests | Add API stability notes and compatibility checks for public exports. |
+| Package and API foundation | 4.3 | 4.5 | Python package, public exports, runtime facade, CLI, service runner export, service config, dependency-light tests | Add public export compatibility checks and release notes discipline. |
 | Markitect profile contract ingress | 3.7 | 4.5 | Profile loading, diagnostics, runtime envelopes, profile-derived config, local alias normalization | Add richer compatibility fixtures and schema drift diagnostics. |
-| Graph and event ingress | 3.7 | 4.5 | Graph loading, endpoint diagnostics, event model, JSONL log, export, repair checks, fake graph/event adapters | Add broader malformed/large graph fixtures and migration repair coverage. |
+| Graph and event ingress | 3.9 | 4.5 | Graph loading, endpoint diagnostics, event model, JSONL log, export, repair checks, corrupt-record diagnostics, fake graph/event adapters | Add broader malformed/large graph fixtures and operator repair utilities. |
 | Phase domain model | 3.5 | 4.5 | Phases, lifecycle states, actions, paths, retention rules, profile-derived transition rules | Add migration semantics for profile/rule changes over durable stores. |
-| Profile execution planning | 4.0 | 4.5 | Adapter plan, capabilities, policy gates, fallback behavior, config-driven local/external resolution | Add compatibility gates for live adapter packs. |
-| Lifecycle planning and apply | 3.6 | 4.5 | Dry-run lifecycle plans, profile rules, review-gated local apply | Add service `lifecycle.apply` handling, migration semantics, and better apply audit queries. |
-| Activation planning | 3.8 | 4.8 | Budgeted activation, selections, package request, graph neighborhoods, paths, ranking, metrics | Wire semantic-index-assisted retrieval and expand evaluation corpora. |
-| Local persistence | 3.2 | 4.5 | File-backed graph store, JSONL event log, audit sink, export, repair diagnostics | Add atomic writes, schema migration, compaction/retention utilities, and stronger corruption recovery. |
-| Policy, review, and audit | 3.5 | 5.0 | Operation points, review records, audit schema, denials, redaction, fake external policy/audit adapters | Add audit query service, retention policy behavior, and live policy adapter boundary. |
-| Observability and operations | 3.3 | 4.8 | Health report, config diagnostics, adapter status, fake telemetry audit sink | Add metrics/event export, retention diagnostics, and deployable health/readiness binding. |
+| Profile execution planning | 4.2 | 4.5 | Adapter plan, capabilities, policy gates, fallback behavior, config-driven local/external resolution, adapter pack manifests | Add compatibility gates for live adapter packs. |
+| Lifecycle planning and apply | 4.0 | 4.5 | Dry-run lifecycle plans, profile rules, review-gated local apply, service `lifecycle.apply`, apply audit queries | Add operator migration semantics and richer apply rollback/repair drills. |
+| Activation planning | 4.0 | 4.8 | Budgeted activation, selections, package request, graph neighborhoods, paths, ranking, metrics, multi-scenario evaluation fixtures | Wire semantic-index-assisted retrieval into runtime planning. |
+| Local persistence | 3.7 | 4.5 | File-backed graph store, JSONL event log, audit sink, atomic JSON writes, metadata migration diagnostics, export, repair diagnostics | Add executable migrations, compaction/retention utilities, and stronger corruption recovery. |
+| Policy, review, and audit | 3.9 | 5.0 | Operation points, review records, audit schema, queryable audit sinks, denials, redaction, fake external policy/audit adapters | Add live policy adapter boundary and enforceable audit retention policy. |
+| Observability and operations | 3.6 | 4.8 | Health report, config diagnostics, adapter status, fake telemetry audit sink, operational recipe | Add metrics/event export and deployable health/readiness binding. |
 | Markitect interop | 3.7 | 4.5 | Local validation, package request/response envelopes, fake compiler | Add optional live Markitect compiler adapter and contract compatibility suite. |
-| Kontextual/Infospace interop | 3.1 | 4.5 | Delegation envelope, fake runtime registry, activation quality report fixture | Add live/fake delegation scenarios and broader Infospace restart reports. |
-| Testing and evaluation | 3.8 | 4.5 | 60 deterministic tests over runtime, CLI, adapters, policy, activation, lifecycle, service, fakes | Add multi-profile/multi-graph evaluation corpus and regression thresholds. |
-| Service readiness | 3.9 | 4.8 | Service contracts, local runner, health, config, adapter conformance, fake pack | Implement missing service operations and optional framework binding. |
-| Developer experience | 3.8 | 4.5 | README, package map, CLI examples, persistence/policy/interop/service/lifecycle/fake-pack docs | Add troubleshooting, examples, and end-to-end recipes. |
+| Kontextual/Infospace interop | 3.3 | 4.5 | Delegation envelope, fake runtime registry, activation quality report fixture, adapter compatibility manifests | Add live/fake delegation scenarios and broader Infospace restart reports. |
+| Testing and evaluation | 4.1 | 4.5 | 70 deterministic tests over runtime, CLI, adapters, policy, activation, lifecycle, service, fakes, and evaluation scenarios | Add larger regression corpus and threshold trend reports. |
+| Service readiness | 4.2 | 4.8 | Service contracts, full local runner parity, health, config, adapter conformance, fake pack | Add optional framework binding and deployable readiness endpoints. |
+| Developer experience | 4.1 | 4.5 | README, package map, CLI examples, persistence/policy/interop/service/lifecycle/fake-pack docs, operational recipe | Add troubleshooting matrix and embedded-service examples. |
 
 ## Assessment
 
-The project has a credible core. The runtime envelopes, policy/review model,
-profile-derived configuration, lifecycle rules, local persistence, fake
-external pack, and conformance helpers form a solid integration boundary.
+The project has crossed the local integration-readiness threshold. The runtime
+envelopes, policy/review model, profile-derived configuration, lifecycle rules,
+local persistence diagnostics, queryable audit path, fake external pack
+manifests, and conformance helpers form a solid integration boundary.
 
-The biggest optimization opportunity is not another broad feature burst. It is
-closing the gap between declared contracts and runnable operational behavior:
-the service contract advertises operations that the local runner only partly
-handles, persistence needs migration/durability semantics, and evaluation needs
-more than one small fixture family.
+The biggest optimization opportunity is now the next operational layer:
+turning diagnostic-only durability into operator actions, adding optional
+deployable service bindings, and testing live or live-shaped adapters behind
+the same conformance suite as the fake pack.
 
-## Recommended Refinement Workplan
+## Completed Refinement Workplan
 
-Create and execute `PMEM-WP-0011`: refinement hardening and operational
-readiness.
+`PMEM-WP-0011` moved the score from 3.8 to 4.0 by adding:
+
+- full local service runner parity for `SERVICE_OPERATIONS`;
+- service-covered `package.compile`, `lifecycle.apply`, and `audit.query`;
+- queryable audit sinks with retention metadata;
+- local-store atomic JSON writes, migration diagnostics, and corrupt-record
+  repair diagnostics;
+- three evaluation scenario families covering policy denial, lifecycle rules,
+  event-path activation, semantic-index hints, and budget pressure;
+- adapter pack manifests and explicit missing-capability diagnostics;
+- an operational end-to-end recipe.
+
+## Recommended Next Refinement
+
+Create and execute `PMEM-WP-0012`: live-adapter and service-binding readiness.
 
 Highest-value tasks:
 
-- Bring service runner parity to the published operation catalog:
-  `package.compile`, `lifecycle.apply`, and `audit.query`.
-- Add local-store schema migration and repair hardening, including atomic write
-  behavior and migration diagnostics.
-- Expand evaluation fixtures across multiple profiles, graph shapes, policies,
-  lifecycle rules, and activation budgets.
-- Add live-adapter readiness manifests so fake and future live packs can be
-  tested by the same compatibility suite.
-- Add audit query and retention semantics that make policy/audit behavior
-  inspectable after runtime operations.
-- Improve DX with troubleshooting, end-to-end recipes, and API compatibility
-  notes.
+- Add an optional framework binding around `LocalServiceRunner` with health and
+  readiness endpoints.
+- Add executable local-store migrations, not only diagnostics.
+- Add live-shaped Markitect/Kontextual adapter fixtures behind the manifest and
+  conformance suite.
+- Add audit retention enforcement and telemetry export drills.
+- Grow the evaluation corpus into threshold reports that can catch regressions.
 
 ## Score Movement Gates
 
-Move overall score to **4.0** when:
+Overall score reached **4.0** when:
 
 - Service runner handles every operation in `SERVICE_OPERATIONS`.
 - Audit query and lifecycle apply are covered through service contracts.
diff --git a/docs/operational-readiness.md b/docs/operational-readiness.md
new file mode 100644
index 0000000..0a72c67
--- /dev/null
+++ b/docs/operational-readiness.md
@@ -0,0 +1,136 @@
+# Operational Readiness Recipe
+
+Updated: 2026-05-18
+
+This recipe exercises the local operational surface without requiring live
+Markitect, Kontextual, or telemetry services. It is the expected smoke path for
+embedding `phase-memory` in another local agent runtime.
+
+## Local End-To-End Flow
+
+```python
+import json
+from pathlib import Path
+
+from phase_memory import LocalServiceRunner
+
+fixtures = Path("tests/fixtures")
+profile = json.loads((fixtures / "memory-profile.json").read_text(encoding="utf-8"))
+graph = json.loads((fixtures / "memory-graph.json").read_text(encoding="utf-8"))
+
+runner = LocalServiceRunner()
+
+profile_plan = runner.handle("profile.plan", {"profile": profile, "source_ref": "recipe:profile"})
+graph_import = runner.handle("graph.import", {"graph": graph, "source_ref": "recipe:graph"})
+lifecycle = runner.handle(
+    "graph.lifecycle.plan",
+    {
+        "profile": profile,
+        "graph": graph,
+        "parameters": {"refresh_digests": {"event.restart": "new-digest"}},
+        "source_ref": "recipe:lifecycle",
+    },
+)
+activation = runner.handle(
+    "graph.activation.plan",
+    {
+        "graph": graph,
+        "budget": {"max_items": 3, "max_tokens": 60},
+        "profile_id": profile["id"],
+        "source_ref": "recipe:activation",
+    },
+)
+package = runner.handle(
+    "package.compile",
+    {
+        "selection": activation["data"]["activation_plan"]["selection"],
+        "source_ref": "recipe:package",
+    },
+)
+audit = runner.handle("audit.query", {"filters": {"operation": "package.compile"}})
+health = runner.handle("health.check")
+```
+
+Expected checks:
+
+- `profile_plan["valid"]`, `graph_import["valid"]`, `activation["valid"]`, and
+  `package["valid"]` are true.
+- `lifecycle["data"]["dry_run_actions"]` contains the planned refresh action.
+- `audit["count"]` is at least 1 and `audit["retention"]` declares the active
+  audit sink retention mode.
+- `health["ok"]` is true.
+
+## Review-Gated Apply
+
+Lifecycle actions that require review are denied until an approval marker or
+matching review record is supplied:
+
+```python
+denied = runner.handle("lifecycle.apply", {"actions": lifecycle["data"]["dry_run_actions"]})
+approved = runner.handle(
+    "lifecycle.apply",
+    {
+        "actions": lifecycle["data"]["dry_run_actions"],
+        "approval_marker": "review:operator-approved",
+    },
+)
+```
+
+Use `audit.query` with the filters `{"operation": "lifecycle.apply", "dry_run": False}` to
+trace denied and approved apply attempts.
+
+## Persistence Repair Drill
+
+File-backed operation is configured through a profile or explicit
+`RuntimeConfig`:
+
+```python
+from phase_memory import RuntimeConfig, LocalServiceRunner
+
+config = RuntimeConfig.from_profile(profile, local_store_path=".phase-memory-local")
+runner = LocalServiceRunner(config=config)
+repair = runner.runtime.repair_diagnostics(source_ref=config.local_store_path)
+```
+
+Repair diagnostics distinguish:
+
+- `store_migration_required` when the metadata declares no schema version or an outdated one (a missing metadata file is reported as `missing_store_metadata`).
+- `planned_store_migrations` when metadata declares pending migrations.
+- `corrupt_store_record` for unreadable node, edge, or path JSON.
+- `missing_edge_source` / `missing_edge_target` for graph reference damage.
+- `orphaned_path_event` when paths reference absent event-log records.
+
+## Adapter Pack Compatibility
+
+Fake and future live adapter packs should publish a manifest with:
+
+- declared capabilities;
+- ownership boundaries for every adapter;
+- required conformance helpers.
+
+Validate a pack before wiring it into the runtime:
+
+```python
+from phase_memory import fake_external_adapter_pack, validate_adapter_pack_manifest
+
+diagnostics = validate_adapter_pack_manifest(fake_external_adapter_pack())
+assert diagnostics == ()
+```
+
+Missing capabilities are reported as `missing_adapter_capability` diagnostics
+with the adapter and capability names attached.
+
+## API Compatibility Expectations
+
+The stable embedding surface is:
+
+- `PhaseMemoryRuntime` methods and JSON-serializable envelopes.
+- `LocalServiceRunner.handle(operation, payload)` for every operation in
+  `service_contracts()["operations"]`.
+- `RuntimeConfig` and `resolve_runtime_adapters` for local/external adapter
+  resolution.
+- Adapter conformance helpers in `phase_memory.service`.
+- External adapter pack manifests and validation helpers.
+
+New public operations should be added to the service contract first, then to
+the local runner, runtime tests, and docs in the same change.
diff --git a/src/phase_memory/__init__.py b/src/phase_memory/__init__.py
index c0a410b..5653b0e 100644
--- a/src/phase_memory/__init__.py
+++ b/src/phase_memory/__init__.py
@@ -11,6 +11,7 @@ from .bridge import (
 )
 from .contracts import graph_from_markitect, profile_from_markitect
 from .external_adapters import (
+    ADAPTER_PACK_MANIFEST_SCHEMA,
     ExternalAdapterPack,
     FakeExternalEventLog,
     FakeExternalGraphStore,
@@ -19,8 +20,10 @@ from .external_adapters import (
     FakeKontextualRuntimeRegistry,
     FakeMarkitectPackageCompiler,
     FakeTelemetryAuditSink,
+    adapter_pack_manifest,
     fake_external_adapter_pack,
     fake_external_runtime_config,
+    validate_adapter_pack_manifest,
 )
 from .lifecycle import (
     LifecycleRuleConfig,
@@ -63,12 +66,13 @@ from .retrieval import (
     retrieve_graph_neighborhood,
     select_event_path,
 )
-from .service import RuntimeAdapterBundle, RuntimeConfig, health_report, resolve_runtime_adapters, runtime_from_config, service_contracts
+from .service import LocalServiceRunner, RuntimeAdapterBundle, RuntimeConfig, health_report, resolve_runtime_adapters, runtime_from_config, service_contracts
 from .planner import plan_profile_execution
 from .runtime import PhaseMemoryRuntime
 
 __all__ = [
     "ActivationPlan",
+    "ADAPTER_PACK_MANIFEST_SCHEMA",
     "Diagnostic",
     "ExternalAdapterPack",
     "FakeExternalEventLog",
@@ -123,6 +127,8 @@ __all__ = [
     "profile_from_markitect",
     "fake_external_adapter_pack",
     "fake_external_runtime_config",
+    "adapter_pack_manifest",
+    "validate_adapter_pack_manifest",
     "path_event",
     "package_request_from_selection",
     "package_response_envelope",
@@ -132,6 +138,7 @@ __all__ = [
     "retrieve_graph_neighborhood",
     "select_event_path",
     "RuntimeConfig",
+    "LocalServiceRunner",
     "RuntimeAdapterBundle",
     "health_report",
     "resolve_runtime_adapters",
diff --git a/src/phase_memory/adapters.py b/src/phase_memory/adapters.py
index 5c22476..e74b198 100644
--- a/src/phase_memory/adapters.py
+++ b/src/phase_memory/adapters.py
@@ -9,6 +9,7 @@ from typing import Any
 from .models import Diagnostic, MemoryEdge, MemoryEvent, MemoryGraph, MemoryNode, MemoryPath, PolicyDecision, ProfileIntent
 
 LOCAL_STORE_SCHEMA = "phase_memory.local_store.v1"
+LOCAL_STORE_METADATA_FILE = "phase-memory.json"
 
 
 class InMemoryMemoryGraphStore:
@@ -141,27 +142,99 @@ class FileBackedMemoryGraphStore:
             metadata={"store_schema_version": LOCAL_STORE_SCHEMA, "store_path": str(self.root)},
         )
 
+    def metadata(self) -> dict[str, Any]:
+        return _read_json(self.root / LOCAL_STORE_METADATA_FILE)
+
     def repair_diagnostics(self, *, events: list[MemoryEvent] | None = None) -> tuple[Diagnostic, ...]:
         diagnostics: list[Diagnostic] = []
-        node_ids = {node.node_id for node in self.list_nodes()}
+        nodes, node_diagnostics = _read_records(self.nodes_dir, MemoryNode.from_mapping, record_type="node")
+        edges, edge_diagnostics = _read_records(self.edges_dir, MemoryEdge.from_mapping, record_type="edge")
+        paths, path_diagnostics = _read_records(self.paths_dir, MemoryPath.from_mapping, record_type="path")
+        diagnostics.extend(self.metadata_diagnostics())
+        diagnostics.extend(node_diagnostics)
+        diagnostics.extend(edge_diagnostics)
+        diagnostics.extend(path_diagnostics)
+
+        node_ids = {node.node_id for node in nodes}
         event_ids = {event.event_id for event in events or ()}
-        for edge in self.list_edges():
+        for edge in edges:
             if edge.source not in node_ids:
                 diagnostics.append(Diagnostic("error", "missing_edge_source", "Edge source does not reference a node.", edge.edge_id, {"source": edge.source}))
             if edge.target not in node_ids:
                 diagnostics.append(Diagnostic("error", "missing_edge_target", "Edge target does not reference a node.", edge.edge_id, {"target": edge.target}))
-        for path in self.list_paths():
+        for path in paths:
             for event_id in path.event_ids:
                 if event_id not in event_ids:
                     diagnostics.append(Diagnostic("warn", "orphaned_path_event", "Path references an event not present in the event log.", path.path_id, {"event_id": event_id}))
         return tuple(diagnostics)
 
+    def metadata_diagnostics(self) -> tuple[Diagnostic, ...]:
+        metadata_path = self.root / LOCAL_STORE_METADATA_FILE
+        if not metadata_path.exists():
+            return (
+                Diagnostic(
+                    "error",
+                    "missing_store_metadata",
+                    "Local store metadata file is missing.",
+                    str(metadata_path),
+                    {"expected_schema_version": LOCAL_STORE_SCHEMA},
+                ),
+            )
+        try:
+            metadata = _read_json(metadata_path)
+        except json.JSONDecodeError as exc:
+            return (
+                Diagnostic(
+                    "error",
+                    "corrupt_store_metadata",
+                    "Local store metadata file is not valid JSON.",
+                    str(metadata_path),
+                    {"error": str(exc)},
+                ),
+            )
+
+        diagnostics: list[Diagnostic] = []
+        schema_version = str(metadata.get("schema_version") or "")
+        if not schema_version:
+            diagnostics.append(
+                Diagnostic(
+                    "warn",
+                    "store_migration_required",
+                    "Local store metadata does not declare a schema version.",
+                    str(metadata_path),
+                    {"from_schema_version": "", "to_schema_version": LOCAL_STORE_SCHEMA},
+                )
+            )
+        elif schema_version != LOCAL_STORE_SCHEMA:
+            diagnostics.append(
+                Diagnostic(
+                    "warn",
+                    "store_migration_required",
+                    "Local store metadata declares a schema version that needs migration.",
+                    str(metadata_path),
+                    {"from_schema_version": schema_version, "to_schema_version": LOCAL_STORE_SCHEMA},
+                )
+            )
+
+        planned = metadata.get("planned_migrations") or metadata.get("migrations") or ()
+        if planned:
+            diagnostics.append(
+                Diagnostic(
+                    "warn",
+                    "planned_store_migrations",
+                    "Local store metadata declares planned migrations.",
+                    str(metadata_path),
+                    {"migrations": list(planned)},
+                )
+            )
+        return tuple(diagnostics)
+
     def _ensure_layout(self) -> None:
         for directory in (self.root, self.profiles_dir, self.nodes_dir, self.edges_dir, self.paths_dir, self.activations_dir):
             directory.mkdir(parents=True, exist_ok=True)
-        metadata_path = self.root / "phase-memory.json"
+        metadata_path = self.root / LOCAL_STORE_METADATA_FILE
         if not metadata_path.exists():
-            _write_json(metadata_path, {"schema_version": LOCAL_STORE_SCHEMA})
+            _write_json(metadata_path, {"schema_version": LOCAL_STORE_SCHEMA, "migrations": []})
 
 
 class JsonlMemoryEventLog:
@@ -244,6 +317,12 @@ class RecordingAuditSink:
         self.events.append(stored)
         return {"recorded": True, "index": len(self.events) - 1, "event": stored}
 
+    def query(self, **filters: Any) -> list[dict[str, Any]]:
+        return filter_audit_events(self.events, **filters)
+
+    def retention_metadata(self) -> dict[str, Any]:
+        return {"mode": "in_memory", "retention_days": None}
+
 
 class JsonlAuditSink:
     def __init__(self, path: str | Path) -> None:
@@ -259,6 +338,20 @@ class JsonlAuditSink:
             index = max(sum(1 for _ in handle) - 1, 0)
         return {"recorded": True, "index": index, "event": stored}
 
+    def query(self, **filters: Any) -> list[dict[str, Any]]:
+        events: list[dict[str, Any]] = []
+        for raw in self.path.read_text(encoding="utf-8").splitlines():
+            if not raw.strip():
+                continue
+            try:
+                events.append(json.loads(raw))
+            except json.JSONDecodeError:
+                continue
+        return filter_audit_events(events, **filters)
+
+    def retention_metadata(self) -> dict[str, Any]:
+        return {"mode": "jsonl", "path": str(self.path), "retention_days": None}
+
 
 class InMemorySemanticIndex:
     def __init__(self) -> None:
@@ -308,4 +401,57 @@ def _read_json(path: Path) -> dict[str, Any]:
 
 def _write_json(path: Path, data: dict[str, Any]) -> None:
     path.parent.mkdir(parents=True, exist_ok=True)
-    path.write_text(json.dumps(data, indent=2, sort_keys=True) + "\n", encoding="utf-8")
+    tmp_path = path.with_name(f".{path.name}.tmp")
+    tmp_path.write_text(json.dumps(data, indent=2, sort_keys=True) + "\n", encoding="utf-8")
+    tmp_path.replace(path)
+
+
+def _read_records(directory: Path, factory, *, record_type: str) -> tuple[list[Any], list[Diagnostic]]:
+    records: list[Any] = []
+    diagnostics: list[Diagnostic] = []
+    for path in sorted(directory.glob("*.json")):
+        try:
+            records.append(factory(_read_json(path)))
+        except (json.JSONDecodeError, ValueError, TypeError, KeyError) as exc:
+            diagnostics.append(
+                Diagnostic(
+                    "error",
+                    "corrupt_store_record",
+                    "Local store record could not be decoded.",
+                    str(path),
+                    {"record_type": record_type, "error": str(exc)},
+                )
+            )
+    return records, diagnostics
+
+
+def filter_audit_events(events: list[dict[str, Any]], **filters: Any) -> list[dict[str, Any]]:
+    return [dict(event) for event in events if _audit_event_matches(event, filters)]
+
+
+def _audit_event_matches(event: dict[str, Any], filters: dict[str, Any]) -> bool:
+    operation = filters.get("operation")
+    if operation is not None and event.get("operation") != operation:
+        return False
+    operation_id = filters.get("operation_id")
+    if operation_id is not None and event.get("operation_id") != operation_id:
+        return False
+    subject_kind = filters.get("subject_kind")
+    if subject_kind is not None and dict(event.get("subject") or {}).get("kind") != subject_kind:
+        return False
+    subject_id = filters.get("subject_id")
+    if subject_id is not None and dict(event.get("subject") or {}).get("id") != subject_id:
+        return False
+    source_ref = filters.get("source_ref")
+    if source_ref is not None and dict(event.get("source") or {}).get("ref") != source_ref:
+        return False
+    actor = filters.get("actor")
+    if actor is not None and event.get("actor") != actor:
+        return False
+    dry_run = filters.get("dry_run")
+    if dry_run is not None and bool(event.get("dry_run")) != bool(dry_run):
+        return False
+    allowed = filters.get("allowed")
+    if allowed is not None and bool(event.get("allowed")) != bool(allowed):
+        return False
+    return True
diff --git a/src/phase_memory/external_adapters.py b/src/phase_memory/external_adapters.py
index eadbe35..f9a255c 100644
--- a/src/phase_memory/external_adapters.py
+++ b/src/phase_memory/external_adapters.py
@@ -5,17 +5,48 @@ from __future__ import annotations
 from dataclasses import dataclass, field
 from typing import Any
 
-from .adapters import InMemoryMemoryEventLog, InMemoryMemoryGraphStore, InMemorySemanticIndex
-from .models import PolicyDecision
+from .adapters import InMemoryMemoryEventLog, InMemoryMemoryGraphStore, InMemorySemanticIndex, filter_audit_events
+from .models import Diagnostic, PolicyDecision
 from .service import RuntimeConfig
 from .utils import stable_digest
 
+ADAPTER_PACK_MANIFEST_SCHEMA = "phase_memory.adapter_pack.manifest.v1"
+ADAPTER_PACK_REQUIRED_ADAPTERS = (
+    "graph_store",
+    "event_log",
+    "policy_gateway",
+    "audit_sink",
+    "package_compiler",
+    "semantic_index",
+    "runtime_registry",
+)
+ADAPTER_CONFORMANCE_HELPERS = {
+    "graph_store": "assert_graph_store_conformance",
+    "event_log": "assert_event_log_conformance",
+    "policy_gateway": "assert_policy_gateway_conformance",
+    "audit_sink": "assert_audit_sink_conformance",
+    "package_compiler": "assert_context_compiler_conformance",
+    "semantic_index": "assert_semantic_index_conformance",
+    "runtime_registry": "assert_runtime_registry_conformance",
+}
+ADAPTER_REQUIRED_CAPABILITIES = {
+    "graph_store": ("kontextual.graph-store.fake",),
+    "event_log": ("kontextual.event-log.fake",),
+    "policy_gateway": ("policy.gateway.fake",),
+    "audit_sink": ("telemetry.audit.fake",),
+    "package_compiler": ("markitect.package.compile",),
+    "semantic_index": ("semantic-index.fake",),
+    "runtime_registry": ("kontextual.runtime.registry",),
+}
+
 
 @dataclass(frozen=True)
 class ExternalAdapterPack:
     name: str
     adapters: dict[str, Any]
     capabilities: tuple[str, ...] = ()
+    ownership_boundaries: dict[str, str] = field(default_factory=dict)
+    required_conformance: dict[str, str] = field(default_factory=dict)
     metadata: dict[str, Any] = field(default_factory=dict)
 
     def to_dict(self) -> dict[str, Any]:
@@ -23,9 +54,14 @@ class ExternalAdapterPack:
             "name": self.name,
             "adapters": {key: value.__class__.__name__ for key, value in sorted(self.adapters.items())},
             "capabilities": list(self.capabilities),
+            "ownership_boundaries": dict(self.ownership_boundaries),
+            "required_conformance": dict(self.required_conformance),
             "metadata": dict(self.metadata),
         }
 
+    def manifest(self) -> dict[str, Any]:
+        return adapter_pack_manifest(self)
+
 
 class FakeExternalGraphStore(InMemoryMemoryGraphStore):
     """Kontextual-shaped graph store fake backed by deterministic memory."""
@@ -92,10 +128,14 @@ class FakeTelemetryAuditSink:
             "event": stored,
         }
 
-    def query(self, *, operation: str | None = None) -> list[dict[str, Any]]:
-        if operation is None:
-            return list(self.events)
-        return [event for event in self.events if event.get("operation") == operation]
+    def query(self, **filters: Any) -> list[dict[str, Any]]:
+        return filter_audit_events(self.events, **filters)
+
+    def retention_metadata(self) -> dict[str, Any]:
+        return {
+            "mode": "fake_telemetry",
+            "retention_days": self.retention_days,
+        }
 
 
 class FakeKontextualRuntimeRegistry:
@@ -133,9 +173,21 @@ def fake_external_adapter_pack() -> ExternalAdapterPack:
             "markitect.package.compile",
             "kontextual.runtime.registry",
             "kontextual.graph-store.fake",
+            "kontextual.event-log.fake",
+            "policy.gateway.fake",
             "telemetry.audit.fake",
             "semantic-index.fake",
         ),
+        ownership_boundaries={
+            "graph_store": "kontextual owns durable graph records; phase-memory owns graph semantics",
+            "event_log": "kontextual owns event durability; phase-memory owns event shape",
+            "policy_gateway": "phase-memory owns policy decision contract; external pack owns gateway implementation",
+            "audit_sink": "external telemetry owns retention; phase-memory owns audit event shape",
+            "package_compiler": "markitect owns package compilation; phase-memory owns selection planning",
+            "semantic_index": "external retrieval owns index mechanics; phase-memory owns activation policy",
+            "runtime_registry": "kontextual owns envelope registry; phase-memory owns envelope contract",
+        },
+        required_conformance=dict(ADAPTER_CONFORMANCE_HELPERS),
         metadata={"intended_for": "local conformance and integration tests"},
     )
 
@@ -158,3 +210,70 @@ def fake_external_runtime_config() -> RuntimeConfig:
         runtime_registry_mode="external",
         trust_zone_labels=("external",),
     )
+
+
+def adapter_pack_manifest(pack: ExternalAdapterPack) -> dict[str, Any]:
+    return {
+        "schema_version": ADAPTER_PACK_MANIFEST_SCHEMA,
+        "name": pack.name,
+        "capabilities": sorted(pack.capabilities),
+        "metadata": dict(pack.metadata),
+        "adapters": {
+            key: {
+                "class": pack.adapters[key].__class__.__name__,
+                "ownership": pack.ownership_boundaries.get(key, ""),
+                "required_capabilities": list(ADAPTER_REQUIRED_CAPABILITIES.get(key, ())),
+                "required_conformance": pack.required_conformance.get(key, ADAPTER_CONFORMANCE_HELPERS.get(key, "")),
+            }
+            for key in sorted(pack.adapters)
+        },
+    }
+
+
+def validate_adapter_pack_manifest(pack: ExternalAdapterPack) -> tuple[Diagnostic, ...]:
+    diagnostics: list[Diagnostic] = []
+    capabilities = set(pack.capabilities)
+    for adapter in ADAPTER_PACK_REQUIRED_ADAPTERS:
+        if adapter not in pack.adapters:
+            diagnostics.append(
+                Diagnostic(
+                    "error",
+                    "missing_adapter",
+                    "Adapter pack is missing a required adapter.",
+                    adapter,
+                    {"adapter": adapter},
+                )
+            )
+            continue
+        if not pack.ownership_boundaries.get(adapter):
+            diagnostics.append(
+                Diagnostic(
+                    "warn",
+                    "missing_adapter_ownership",
+                    "Adapter pack does not declare an ownership boundary for this adapter.",
+                    adapter,
+                    {"adapter": adapter},
+                )
+            )
+        if not pack.required_conformance.get(adapter):
+            diagnostics.append(
+                Diagnostic(
+                    "warn",
+                    "missing_conformance_helper",
+                    "Adapter pack does not declare the conformance helper required for this adapter.",
+                    adapter,
+                    {"adapter": adapter},
+                )
+            )
+        for capability in ADAPTER_REQUIRED_CAPABILITIES.get(adapter, ()):
+            if capability not in capabilities:
+                diagnostics.append(
+                    Diagnostic(
+                        "error",
+                        "missing_adapter_capability",
+                        "Adapter pack does not declare a capability required by an adapter.",
+                        adapter,
+                        {"adapter": adapter, "capability": capability},
+                    )
+                )
+    return tuple(diagnostics)
diff --git a/src/phase_memory/policy.py b/src/phase_memory/policy.py
index 2883d94..796ab2b 100644
--- a/src/phase_memory/policy.py
+++ b/src/phase_memory/policy.py
@@ -21,6 +21,7 @@ class MemoryOperation(str, Enum):
     LIFECYCLE_PLAN = "graph.lifecycle.plan"
     ACTIVATION_PLAN = "graph.activation.plan"
     PACKAGE_COMPILE = "package.compile"
+    AUDIT_QUERY = "audit.query"
     LIFECYCLE_APPLY = "lifecycle.apply"
     STABILIZATION = "memory.stabilize"
     COMPACTION = "memory.compact"
diff --git a/src/phase_memory/ports.py b/src/phase_memory/ports.py
index d6b1a3d..b2b088f 100644
--- a/src/phase_memory/ports.py
+++ b/src/phase_memory/ports.py
@@ -39,6 +39,7 @@ class PolicyGateway(Protocol):
 
 class AuditSink(Protocol):
     def record(self, event: dict[str, Any]) -> dict[str, Any]: ...
+    def query(self, **filters: Any) -> list[dict[str, Any]]: ...
 
 
 class RuntimeRegistry(Protocol):
diff --git a/src/phase_memory/runtime.py b/src/phase_memory/runtime.py
index e2e1805..927e298 100644
--- a/src/phase_memory/runtime.py
+++ b/src/phase_memory/runtime.py
@@ -14,6 +14,7 @@ from .adapters import (
     InMemoryMemoryGraphStore,
     NoopContextPackageCompiler,
     RecordingAuditSink,
+    filter_audit_events,
 )
 from .bridge import MARKITECT_PACKAGE_REQUEST_SCHEMA, package_request_from_selection, package_response_envelope
 from .contracts import ContractIngressResult, graph_from_markitect, profile_from_markitect
@@ -36,6 +37,7 @@ from .ports import AuditSink, ContextPackageCompiler, MemoryEventLog, MemoryGrap
 from .utils import compact_dict, stable_digest, to_plain
 
 RUNTIME_ENVELOPE_SCHEMA = "phase_memory.runtime.envelope.v1"
+AUDIT_QUERY_SCHEMA = "phase_memory.audit.query.v1"
 PACKAGE_REQUEST_SCHEMA = MARKITECT_PACKAGE_REQUEST_SCHEMA
 
 
@@ -263,6 +265,41 @@ class PhaseMemoryRuntime:
             data={"package_request": request, "package_response": package_response_envelope(response, request_id=request["id"])},
         )
 
+    def query_audit(self, filters: dict[str, Any] | None = None, *, source_ref: str = "audit") -> dict[str, Any]:
+        filters = _audit_filters(filters or {})
+        policy = self.policy_gateway.authorize(
+            action="audit.query",
+            resource="audit:events",
+            context={"source_ref": source_ref, "dry_run": True, "filters": filters},
+        )
+        events, diagnostics = _query_audit_sink(self.audit_sink, filters) if policy.allowed else ([], ())
+        operation_id = f"op:{stable_digest(['audit.query', source_ref, filters])}"
+        audit = self.audit_sink.record(
+            audit_event(
+                operation_id=operation_id,
+                operation="audit.query",
+                subject={"kind": "audit_events", "id": stable_digest(filters)},
+                policy_decision=policy,
+                dry_run=True,
+                source_ref=source_ref,
+            )
+        )
+        return {
+            "schema_version": AUDIT_QUERY_SCHEMA,
+            "operation_id": operation_id,
+            "operation": "audit.query",
+            "dry_run": True,
+            "valid": policy.allowed and not any(diagnostic.severity == "error" for diagnostic in diagnostics),
+            "filters": filters,
+            "count": len(events),
+            "events": events,
+            "retention": _audit_retention_metadata(self.audit_sink),
+            "policy_decision": _policy_to_dict(policy),
+            "audit_receipt": audit,
+            "diagnostics": [diagnostic.to_dict() for diagnostic in diagnostics],
+            "source": {"ref": source_ref},
+        }
+
     def export_graph(self, *, graph_id: str = "local", source_ref: str = "local-store") -> dict[str, Any]:
         events = self.event_log.list_events()
         if hasattr(self.graph_store, "export_graph"):
@@ -450,6 +487,55 @@ def _policy_to_dict(decision: PolicyDecision) -> dict[str, Any]:
     return decision.to_dict() if hasattr(decision, "to_dict") else to_plain(decision)
 
 
+def _audit_filters(filters: dict[str, Any]) -> dict[str, Any]:
+    allowed_keys = {
+        "operation",
+        "operation_id",
+        "subject_kind",
+        "subject_id",
+        "source_ref",
+        "actor",
+        "dry_run",
+        "allowed",
+    }
+    return {key: filters[key] for key in sorted(allowed_keys) if key in filters and filters[key] is not None}
+
+
+def _query_audit_sink(sink: AuditSink, filters: dict[str, Any]) -> tuple[list[dict[str, Any]], tuple[Diagnostic, ...]]:
+    if hasattr(sink, "query"):
+        try:
+            return list(sink.query(**filters)), ()
+        except TypeError:
+            try:
+                events = sink.query(operation=filters.get("operation"))
+            except TypeError:
+                events = sink.query()
+            return filter_audit_events(list(events), **filters), ()
+    if hasattr(sink, "events"):
+        return filter_audit_events(list(getattr(sink, "events")), **filters), ()
+    return (
+        [],
+        (
+            Diagnostic(
+                "error",
+                "audit_query_unsupported",
+                "Audit sink does not expose queryable audit records.",
+                sink.__class__.__name__,
+            ),
+        ),
+    )
+
+
+def _audit_retention_metadata(sink: AuditSink) -> dict[str, Any]:
+    if hasattr(sink, "retention_metadata"):
+        return dict(sink.retention_metadata())
+    retention_days = getattr(sink, "retention_days", None)
+    return {
+        "mode": sink.__class__.__name__,
+        "retention_days": retention_days,
+    }
+
+
 def _coerce_action(data: LifecycleAction | dict[str, Any]) -> LifecycleAction:
     if isinstance(data, LifecycleAction):
         return data
diff --git a/src/phase_memory/service.py b/src/phase_memory/service.py
index 76fe564..2ede028 100644
--- a/src/phase_memory/service.py
+++ b/src/phase_memory/service.py
@@ -379,6 +379,26 @@ class LocalServiceRunner:
                 max_items=int(budget["max_items"]),
                 max_tokens=int(budget["max_tokens"]),
                 profile_id=payload.get("profile_id"),
+                priority_node_ids=tuple(payload.get("priority_node_ids") or ()),
+                include_events=bool(payload.get("include_events", True)),
+                policy_context=dict(payload.get("policy_context") or {}),
+            )
+        if operation == "package.compile":
+            return self.runtime.compile_package(
+                payload["selection"],
+                source_ref=payload.get("source_ref", "service"),
+            )
+        if operation == "lifecycle.apply":
+            return self.runtime.apply_lifecycle_actions(
+                payload["actions"],
+                approval_marker=str(payload.get("approval_marker") or ""),
+                review_record=payload.get("review_record"),
+                source_ref=payload.get("source_ref", "service"),
+            )
+        if operation == "audit.query":
+            return self.runtime.query_audit(
+                dict(payload.get("filters") or {}),
+                source_ref=payload.get("source_ref", "service"),
             )
         raise ValueError(f"Unsupported service operation: {operation}")
 
@@ -433,6 +453,7 @@ def assert_policy_gateway_conformance(gateway) -> None:
 def assert_audit_sink_conformance(sink) -> None:
     receipt = sink.record({"operation": "conformance"})
     assert receipt.get("recorded") is True
+    assert sink.query(operation="conformance")[0]["operation"] == "conformance"
 
 
 def assert_semantic_index_conformance(index) -> None:
diff --git a/tests/fixtures/evaluation-scenarios.json b/tests/fixtures/evaluation-scenarios.json
new file mode 100644
index 0000000..926297a
--- /dev/null
+++ b/tests/fixtures/evaluation-scenarios.json
@@ -0,0 +1,170 @@
+{
+  "schema_version": "phase_memory.evaluation_scenarios.v1",
+  "scenarios": [
+    {
+      "id": "policy-denied-activation",
+      "profile": {
+        "schema_version": "markitect.memory.profile.v1",
+        "id": "eval-policy-profile",
+        "memory_kinds": ["knowledge", "decision"],
+        "activation": {"max_items": 4, "max_tokens": 60},
+        "policy": {"mode": "allow-all", "trust_zone_labels": ["local"]},
+        "observability": {"audit_sink": "recording"}
+      },
+      "graph": {
+        "schema_version": "markitect.memory.graph.v1",
+        "id": "eval-policy-graph",
+        "nodes": [
+          {
+            "id": "policy.public",
+            "kind": "knowledge",
+            "text": "Public operating constraint that can be activated for local planning.",
+            "phase": "stabilized",
+            "policy": {"labels": ["public"], "trust_zone": "local"},
+            "source_spans": [{"path": "policy.md", "line_start": 1}],
+            "metadata": {"graph_id": "eval-policy-graph"}
+          },
+          {
+            "id": "policy.secret",
+            "kind": "knowledge",
+            "text": "Sensitive credential note that must not enter restart context.",
+            "phase": "stabilized",
+            "policy": {"labels": ["restricted"], "trust_zone": "local", "secret": true},
+            "metadata": {"graph_id": "eval-policy-graph"}
+          }
+        ],
+        "edges": [
+          {
+            "id": "edge.policy",
+            "kind": "references",
+            "source": "policy.public",
+            "target": "policy.secret"
+          }
+        ],
+        "events": []
+      },
+      "expect": {"denied_node_ids": ["policy.secret"]}
+    },
+    {
+      "id": "profile-lifecycle-rules",
+      "profile": {
+        "schema_version": "markitect.memory.profile.v1",
+        "id": "eval-lifecycle-profile",
+        "memory_kinds": ["episode", "decision"],
+        "retention": {
+          "episode": {"stale_after_days": 7},
+          "decision": {"delete_after_days": 365}
+        },
+        "refresh": {"mode": "enabled"},
+        "compaction": {"node_ids": ["life.old-episode"]},
+        "metadata": {
+          "phase_transitions": [
+            {
+              "node_kind": "decision",
+              "from_phase": "fluid",
+              "to_phase": "stabilized",
+              "min_age_days": 2,
+              "reason": "decision has stabilized"
+            }
+          ]
+        }
+      },
+      "graph": {
+        "schema_version": "markitect.memory.graph.v1",
+        "id": "eval-lifecycle-graph",
+        "nodes": [
+          {
+            "id": "life.old-episode",
+            "kind": "episode",
+            "text": "An old episode ready to become stale and compacted.",
+            "phase": "fluid",
+            "freshness": {"updated_at": "2026-04-01T00:00:00+00:00", "source_digest": "old"},
+            "metadata": {"graph_id": "eval-lifecycle-graph"}
+          },
+          {
+            "id": "life.decision",
+            "kind": "decision",
+            "text": "A decision that should transition to stabilized after review.",
+            "phase": "fluid",
+            "freshness": {"updated_at": "2026-05-01T00:00:00+00:00", "source_digest": "decision-old"},
+            "metadata": {"graph_id": "eval-lifecycle-graph"}
+          }
+        ],
+        "edges": [],
+        "events": []
+      },
+      "expect": {
+        "actions": [
+          ["life.old-episode", "mark_stale"],
+          ["life.decision", "transition_phase"],
+          ["life.decision", "refresh"]
+        ],
+        "compact_source": "life.old-episode"
+      }
+    },
+    {
+      "id": "budget-path-and-semantic-hints",
+      "profile": {
+        "schema_version": "markitect.memory.profile.v1",
+        "id": "eval-budget-profile",
+        "memory_kinds": ["decision", "knowledge", "episode"],
+        "activation": {"max_items": 2, "max_tokens": 16, "semantic_index": "memory"}
+      },
+      "graph": {
+        "schema_version": "markitect.memory.graph.v1",
+        "id": "eval-budget-graph",
+        "nodes": [
+          {
+            "id": "budget.anchor",
+            "kind": "decision",
+            "text": "Restart anchor with source.",
+            "phase": "stabilized",
+            "source_spans": [{"path": "restart.md", "line_start": 3}],
+            "metadata": {"graph_id": "eval-budget-graph"}
+          },
+          {
+            "id": "budget.semantic",
+            "kind": "knowledge",
+            "text": "Semantic index hint for restart package selection.",
+            "phase": "stabilized",
+            "source_spans": [{"path": "retrieval.md", "line_start": 7}],
+            "metadata": {"graph_id": "eval-budget-graph"}
+          },
+          {
+            "id": "budget.long",
+            "kind": "episode",
+            "text": "This verbose episode is intentionally long enough to lose against the strict activation token budget pressure.",
+            "phase": "fluid",
+            "metadata": {"graph_id": "eval-budget-graph"}
+          }
+        ],
+        "edges": [
+          {
+            "id": "edge.budget",
+            "kind": "supports",
+            "source": "budget.anchor",
+            "target": "budget.semantic"
+          }
+        ],
+        "events": [
+          {
+            "id": "budget.path-event",
+            "kind": "activated",
+            "timestamp": "2026-05-18T00:00:00+00:00",
+            "activation_refs": ["activation.budget"]
+          }
+        ]
+      },
+      "path": {
+        "id": "path.budget",
+        "event_ids": ["budget.path-event"]
+      },
+      "expect": {
+        "selected_node_ids": ["budget.anchor", "budget.semantic"],
+        "omitted_node_ids": ["budget.long"],
+        "semantic_top_id": "budget.semantic",
+        "event_ids": ["budget.path-event"]
+      }
+    }
+  ]
+}
diff --git a/tests/test_evaluation_scenarios.py b/tests/test_evaluation_scenarios.py
new file mode 100644
index 0000000..13115f8
--- /dev/null
+++ b/tests/test_evaluation_scenarios.py
@@ -0,0 +1,101 @@
+import json
+from datetime import datetime, timezone
+from pathlib import Path
+
+from phase_memory.adapters import InMemorySemanticIndex
+from phase_memory.contracts import graph_from_markitect
+from phase_memory.models import ActivationPlan, MemoryPath
+from phase_memory.retrieval import activation_quality_report, select_event_path
+from phase_memory.runtime import PhaseMemoryRuntime
+
+
+FIXTURES = Path(__file__).parent / "fixtures"
+
+
+def _scenarios():
+    data = json.loads((FIXTURES / "evaluation-scenarios.json").read_text(encoding="utf-8"))
+    return {scenario["id"]: scenario for scenario in data["scenarios"]}
+
+
+def test_policy_denied_activation_scenario_is_redacted_and_audited() -> None:
+    scenario = _scenarios()["policy-denied-activation"]
+    runtime = PhaseMemoryRuntime()
+
+    response = runtime.plan_activation(
+        scenario["graph"],
+        max_items=4,
+        max_tokens=60,
+        profile_id=scenario["profile"]["id"],
+        policy_context={"denied_labels": ["restricted"], "secrets_allowed": False, "trust_zone": "local"},
+    )
+    audit = runtime.query_audit({"operation": "graph.activation.plan"})
+
+    denied_ids = [item["id"] for item in response["data"]["policy_denials"]]
+    assert response["valid"] is True
+    assert denied_ids == scenario["expect"]["denied_node_ids"]
+    assert response["data"]["policy_denials"][0]["text"] == "[REDACTED]"
+    assert [diagnostic["code"] for diagnostic in response["diagnostics"]] == ["activation_policy_denied"]
+    assert audit["count"] == 1
+
+
+def test_profile_lifecycle_rules_scenario_emits_expected_actions() -> None:
+    scenario = _scenarios()["profile-lifecycle-rules"]
+    runtime = PhaseMemoryRuntime()
+
+    response = runtime.plan_lifecycle_with_profile(
+        scenario["profile"],
+        scenario["graph"],
+        refresh_digests={"life.decision": "decision-new"},
+        now=datetime(2026, 5, 18, tzinfo=timezone.utc),
+    )
+
+    actions = [(action["target_id"], action["action"]) for action in response["data"]["dry_run_actions"]]
+    compact_actions = [action for action in response["data"]["dry_run_actions"] if action["action"] == "compact"]
+    assert response["valid"] is True
+    for expected in scenario["expect"]["actions"]:
+        assert tuple(expected) in actions
+    assert compact_actions[0]["metadata"]["source_node_ids"] == [scenario["expect"]["compact_source"]]
+
+
+def test_budget_path_and_semantic_hint_scenario_meets_quality_thresholds() -> None:
+    scenario = _scenarios()["budget-path-and-semantic-hints"]
+    graph = graph_from_markitect(scenario["graph"]).value
+    runtime = PhaseMemoryRuntime()
+    index = InMemorySemanticIndex()
+
+    index.upsert_nodes(list(graph.nodes))
+    response = runtime.plan_activation(
+        scenario["graph"],
+        max_items=scenario["profile"]["activation"]["max_items"],
+        max_tokens=scenario["profile"]["activation"]["max_tokens"],
+        profile_id=scenario["profile"]["id"],
+        priority_node_ids=tuple(scenario["expect"]["selected_node_ids"]),
+    )
+    path = MemoryPath.from_mapping(scenario["path"])
+    selected_path_events = select_event_path(graph.events, path, max_events=2)
+    semantic_results = index.query(graph_id=graph.graph_id, query="semantic restart", limit=2)
+    report = activation_quality_report(_activation_plan(response), expected_node_ids=tuple(scenario["expect"]["selected_node_ids"]))
+
+    plan = response["data"]["activation_plan"]
+    assert plan["selected_node_ids"] == scenario["expect"]["selected_node_ids"]
+    assert [item["id"] for item in plan["omitted"]] == scenario["expect"]["omitted_node_ids"]
+    assert selected_path_events == tuple(scenario["expect"]["event_ids"])
+    assert semantic_results[0]["id"] == scenario["expect"]["semantic_top_id"]
+    assert report["source_span_coverage"] == 1.0
+    assert report["explanation_coverage"] == 1.0
+
+
+def _activation_plan(response):
+    data = response["data"]["activation_plan"]
+    return ActivationPlan(
+        plan_id=data["plan_id"],
+        graph_id=data["graph_id"],
+        selected_node_ids=tuple(data["selected_node_ids"]),
+        selected_event_ids=tuple(data["selected_event_ids"]),
+        omitted=tuple(data["omitted"]),
+        token_estimate=data["token_estimate"],
+        max_items=data["max_items"],
+        max_tokens=data["max_tokens"],
+        selection=response["data"]["package_request"]["selection"],
+        diagnostics=(),
+    )
diff --git a/tests/test_external_adapter_packs.py b/tests/test_external_adapter_packs.py
index 10ef319..e22c752 100644
--- a/tests/test_external_adapter_packs.py
+++ b/tests/test_external_adapter_packs.py
@@ -1,7 +1,14 @@
 import json
 from pathlib import Path
 
-from phase_memory.external_adapters import fake_external_adapter_pack, fake_external_runtime_config
+from phase_memory.external_adapters import (
+    ADAPTER_PACK_MANIFEST_SCHEMA,
+    ExternalAdapterPack,
+    adapter_pack_manifest,
+    fake_external_adapter_pack,
+    fake_external_runtime_config,
+    validate_adapter_pack_manifest,
+)
 from phase_memory.service import (
     assert_audit_sink_conformance,
     assert_context_compiler_conformance,
@@ -37,6 +44,35 @@ def test_fake_external_adapter_pack_satisfies_public_conformance_helpers() -> No
     assert pack.to_dict()["adapters"]["package_compiler"] == "FakeMarkitectPackageCompiler"
 
 
+def test_fake_external_adapter_pack_manifest_declares_compatibility() -> None:
+    pack = fake_external_adapter_pack()
+
+    manifest = adapter_pack_manifest(pack)
+    diagnostics = validate_adapter_pack_manifest(pack)
+
+    assert manifest["schema_version"] == ADAPTER_PACK_MANIFEST_SCHEMA
+    assert manifest["adapters"]["package_compiler"]["required_conformance"] == "assert_context_compiler_conformance"
+    assert manifest["adapters"]["audit_sink"]["required_capabilities"] == ["telemetry.audit.fake"]
+    assert diagnostics == ()
+
+
+def test_adapter_pack_manifest_reports_missing_capabilities() -> None:
+    pack = fake_external_adapter_pack()
+    incomplete = ExternalAdapterPack(
+        name=pack.name,
+        adapters=pack.adapters,
+        capabilities=tuple(capability for capability in pack.capabilities if capability != "telemetry.audit.fake"),
+        ownership_boundaries=pack.ownership_boundaries,
+        required_conformance=pack.required_conformance,
+        metadata=pack.metadata,
+    )
+
+    diagnostics = validate_adapter_pack_manifest(incomplete)
+
+    assert [diagnostic.code for diagnostic in diagnostics] == ["missing_adapter_capability"]
+    assert diagnostics[0].metadata["capability"] == "telemetry.audit.fake"
+
+
 def test_external_runtime_config_resolves_supplied_fake_pack() -> None:
     config = fake_external_runtime_config()
     pack = fake_external_adapter_pack()
diff --git a/tests/test_file_backed_runtime.py b/tests/test_file_backed_runtime.py
index 06f147d..9a65f0e 100644
--- a/tests/test_file_backed_runtime.py
+++ b/tests/test_file_backed_runtime.py
@@ -87,6 +87,44 @@ def test_repair_diagnostics_report_missing_edges_and_orphaned_path_events(tmp_pa
     assert [diagnostic["code"] for diagnostic in envelope["diagnostics"]] == ["missing_edge_target", "orphaned_path_event"]
 
 
+def test_file_backed_store_reports_migration_needs_and_uses_atomic_json_writes(tmp_path) -> None:
+    store = FileBackedMemoryGraphStore(tmp_path)
+    metadata_path = tmp_path / "phase-memory.json"
+    metadata_path.write_text(
+        json.dumps(
+            {
+                "schema_version": "phase_memory.local_store.v0",
+                "planned_migrations": ["v0-to-v1"],
+            }
+        ),
+        encoding="utf-8",
+    )
+
+    store.save_node(MemoryNode("node.atomic", "decision", "Atomic write target"))
+    runtime = PhaseMemoryRuntime(graph_store=store, event_log=JsonlMemoryEventLog(tmp_path / "events.jsonl"))
+
+    envelope = runtime.repair_diagnostics(source_ref=str(tmp_path))
+
+    codes = [diagnostic["code"] for diagnostic in envelope["diagnostics"]]
+    assert envelope["valid"] is True
+    assert "store_migration_required" in codes
+    assert "planned_store_migrations" in codes
+    assert not list(tmp_path.rglob("*.tmp"))
+
+
+def test_repair_diagnostics_distinguish_corrupt_store_records(tmp_path) -> None:
+    store = FileBackedMemoryGraphStore(tmp_path)
+    runtime = PhaseMemoryRuntime(graph_store=store, event_log=JsonlMemoryEventLog(tmp_path / "events.jsonl"))
+
+    (tmp_path / "nodes" / "broken.json").write_text("{not-json}\n", encoding="utf-8")
+
+    envelope = runtime.repair_diagnostics(source_ref=str(tmp_path))
+
+    assert envelope["valid"] is False
+    assert envelope["diagnostics"][0]["code"] == "corrupt_store_record"
+    assert envelope["diagnostics"][0]["metadata"]["record_type"] == "node"
+
+
 def test_lifecycle_apply_requires_approval_for_reviewable_actions(tmp_path) -> None:
     store = FileBackedMemoryGraphStore(tmp_path)
     runtime = PhaseMemoryRuntime(graph_store=store, event_log=JsonlMemoryEventLog(tmp_path / "events.jsonl"))
diff --git a/tests/test_service_readiness.py b/tests/test_service_readiness.py
index 8072e16..fa45158 100644
--- a/tests/test_service_readiness.py
+++ b/tests/test_service_readiness.py
@@ -1,7 +1,8 @@
 import json
 from pathlib import Path
 
-from phase_memory.models import LifecycleState, MemoryNode
+from phase_memory.lifecycle import plan_compaction
+from phase_memory.models import LifecycleAction, LifecycleActionKind, LifecycleState, MemoryNode
 from phase_memory.service import (
     HEALTH_REPORT_SCHEMA,
     KONTEXTUAL_DELEGATION_SCHEMA,
@@ -76,6 +77,58 @@ def test_service_runner_handles_profile_driven_lifecycle_plan() -> None:
     assert ("event.restart", "refresh") in actions
 
 
+def test_service_runner_handles_package_compile_and_audit_query() -> None:
+    runner = LocalServiceRunner()
+    selection = {
+        "schema_version": "markitect.memory.selection.v1",
+        "id": "selection.service",
+        "nodes": ["decision.boundary"],
+        "events": ["event.activation"],
+    }
+
+    compiled = runner.handle("package.compile", {"selection": selection, "source_ref": "service-test"})
+    audit = runner.handle("audit.query", {"filters": {"operation": "package.compile"}})
+
+    assert compiled["operation"] == "package.compile"
+    assert compiled["data"]["package_response"]["package_ref"] == "package:selection.service"
+    assert audit["operation"] == "audit.query"
+    assert audit["count"] == 1
+    assert audit["events"][0]["source"]["ref"] == "service-test"
+    assert audit["retention"]["mode"] == "in_memory"
+
+
+def test_service_runner_handles_review_gated_lifecycle_apply() -> None:
+    runner = LocalServiceRunner()
+    node = runner.runtime.graph_store.save_node(MemoryNode("node.review", "episode", "Review gated content"))
+    compact = plan_compaction([node]).to_dict()
+
+    denied = runner.handle("lifecycle.apply", {"actions": [compact]})
+    applied = runner.handle("lifecycle.apply", {"actions": [compact], "approval_marker": "review:service"})
+    audit = runner.handle("audit.query", {"filters": {"operation": "lifecycle.apply", "dry_run": False}})
+
+    assert denied["valid"] is False
+    assert denied["data"]["denied"][0]["reason"] == "review_required"
+    assert applied["valid"] is True
+    assert runner.runtime.graph_store.get_node(applied["data"]["applied"][0]["target_id"]).kind == "summary"
+    assert audit["count"] == 2
+
+
+def test_service_runner_handles_non_review_lifecycle_apply() -> None:
+    runner = LocalServiceRunner()
+    runner.runtime.graph_store.save_node(MemoryNode("node.stale.service", "episode"))
+    action = LifecycleAction(
+        LifecycleActionKind.MARK_STALE,
+        "node.stale.service",
+        from_state=LifecycleState.ACTIVE,
+        to_state=LifecycleState.STALE,
+    )
+
+    applied = runner.handle("lifecycle.apply", {"actions": [action.to_dict()]})
+
+    assert applied["valid"] is True
+    assert runner.runtime.graph_store.get_node("node.stale.service").lifecycle == LifecycleState.STALE
+
+
 def test_profile_driven_runtime_config_resolves_file_backed_adapters(tmp_path) -> None:
     config = RuntimeConfig.from_profile(
         {
diff --git a/workplans/PMEM-WP-0011-refinement-hardening-and-operational-readiness.md b/workplans/PMEM-WP-0011-refinement-hardening-and-operational-readiness.md
index a20bdce..7e1bd34 100644
--- a/workplans/PMEM-WP-0011-refinement-hardening-and-operational-readiness.md
+++ b/workplans/PMEM-WP-0011-refinement-hardening-and-operational-readiness.md
@@ -4,7 +4,7 @@ type: workplan
 title: "Refinement Hardening And Operational Readiness"
 domain: markitect
 repo: phase-memory
-status: ready
+status: finished
 owner: codex
 topic_slug: phase-memory
 created: "2026-05-18"
@@ -34,7 +34,7 @@ The repo now has:
 - fake external adapter packs.
 
 The refined scorecard in `docs/maturity-scorecard.md` scores the project at
-**3.8 / 5** overall, with stronger local integration maturity than operational
+**4.0 / 5** overall, with stronger local integration maturity than operational
 maturity.
 
 ## Non-Goals
@@ -48,7 +48,7 @@ maturity.
 
 ```task
 id: PMEM-WP-0011-T01
-status: todo
+status: done
 priority: high
 state_hub_task_id: "2b3c6eb4-8d3f-4c73-ab53-74e1bed8b93f"
 ```
@@ -68,7 +68,7 @@ Acceptance:
 
 ```task
 id: PMEM-WP-0011-T02
-status: todo
+status: done
 priority: high
 state_hub_task_id: "2c19cfb0-e147-40b8-b964-6c617bddb90e"
 ```
@@ -86,7 +86,7 @@ Acceptance:
 
 ```task
 id: PMEM-WP-0011-T03
-status: todo
+status: done
 priority: high
 state_hub_task_id: "cdce1c6a-4581-4184-87c6-f7bec6c3fcbd"
 ```
@@ -105,7 +105,7 @@ Acceptance:
 
 ```task
 id: PMEM-WP-0011-T04
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "602c22bb-d440-4d38-a51f-bf6ed504fd1e"
 ```
@@ -123,7 +123,7 @@ Acceptance:
 
 ```task
 id: PMEM-WP-0011-T05
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "c4fa6001-b20c-4ec1-b885-af9b80c832de"
 ```
@@ -140,7 +140,7 @@ Acceptance:
 
 ```task
 id: PMEM-WP-0011-T06
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "f4674eaf-cbc1-4eac-b1d1-b07ae51289cf"
 ```
@@ -162,4 +162,24 @@ Acceptance:
 
 ## Closure Review
 
-Pending implementation.
+Completed on 2026-05-18.
+
+Implemented:
+
+- Full `LocalServiceRunner` handling for every `SERVICE_OPERATIONS` entry.
+- Runtime and service audit queries with queryable recording/JSONL/fake audit
+  sinks and retention metadata.
+- Review-gated `lifecycle.apply` and `package.compile` service coverage.
+- Atomic JSON writes for file-backed store records plus metadata migration,
+  planned-migration, corrupt-record, missing-reference, and orphaned-path
+  diagnostics.
+- Three evaluation scenario families covering policy-denied activation,
+  profile lifecycle rules, event-path activation, semantic-index hints, and
+  budget pressure.
+- Adapter pack compatibility manifests and explicit missing-capability
+  diagnostics.
+- Operational readiness docs and scorecard update from 3.8 to 4.0.
+
+Verification:
+
+- `python3 -m pytest` passes with 70 tests.
diff --git a/workplans/PMEM-WP-0012-live-adapter-and-service-binding-readiness.md b/workplans/PMEM-WP-0012-live-adapter-and-service-binding-readiness.md
new file mode 100644
index 0000000..1d12a8c
--- /dev/null
+++ b/workplans/PMEM-WP-0012-live-adapter-and-service-binding-readiness.md
@@ -0,0 +1,152 @@
+---
+id: PMEM-WP-0012
+type: workplan
+title: "Live Adapter And Service Binding Readiness"
+domain: markitect
+repo: phase-memory
+status: ready
+owner: codex
+topic_slug: phase-memory
+created: "2026-05-18"
+updated: "2026-05-18"
+state_hub_workstream_id: "427b91ad-9df1-4053-aeb0-54ee39b6bf62"
+---
+
+# PMEM-WP-0012: Live Adapter And Service Binding Readiness
+
+## Goal
+
+Move phase-memory from local integration readiness toward operational
+readiness by adding deployable service bindings, executable migration behavior,
+and live-shaped adapter compatibility gates while preserving the dependency-light
+local runtime.
+
+## Current Evidence
+
+`PMEM-WP-0011` brought the scorecard to **4.0 / 5** by closing local service
+runner parity, queryable audit behavior, persistence diagnostics, multi-scenario
+evaluation fixtures, adapter pack manifests, and operational recipes.
+
+## Non-Goals
+
+- Require live external credentials in default tests.
+- Make a specific web framework mandatory for library users.
+- Move Markitect or Kontextual ownership into this repo.
+- Replace deterministic fake adapters with network-dependent defaults.
+
+## T01 - Add optional service binding
+
+```task
+id: PMEM-WP-0012-T01
+status: todo
+priority: high
+state_hub_task_id: "1244aabb-b8a3-4053-8454-499e8772f5bf"
+```
+
+Add an optional framework binding or adapter shell around `LocalServiceRunner`
+for health, readiness, and operation dispatch.
+
+Acceptance:
+
+- Binding preserves the framework-neutral `LocalServiceRunner` API.
+- Health/readiness endpoints cover config diagnostics and adapter availability.
+- Tests run without starting a network listener by default.
+
+## T02 - Add executable local-store migrations
+
+```task
+id: PMEM-WP-0012-T02
+status: todo
+priority: high
+state_hub_task_id: "b8d3e7a0-c538-4d6c-b2f8-7c33b17c850a"
+```
+
+Turn migration diagnostics into explicit migration plan and apply operations.
+
+Acceptance:
+
+- Store metadata can produce deterministic migration plans.
+- Migration apply updates metadata atomically and records an audit event.
+- Tests cover no-op, old-schema, planned-migration, and corrupt-metadata paths.
+
+## T03 - Add live-shaped adapter compatibility fixtures
+
+```task
+id: PMEM-WP-0012-T03
+status: todo
+priority: high
+state_hub_task_id: "e385af31-13f2-4be0-8fcf-89586e2d3954"
+```
+
+Add adapter fixtures that model Markitect and Kontextual live behavior behind
+the same manifest and conformance helpers used by fake packs.
+
+Acceptance:
+
+- Adapter manifest validation covers fake and live-shaped packs.
+- Capability and ownership failures remain explicit diagnostics.
+- The runtime can resolve live-shaped packs without changing local code paths.
+
+## T04 - Add audit retention and telemetry export drills
+
+```task
+id: PMEM-WP-0012-T04
+status: todo
+priority: medium
+state_hub_task_id: "d203294a-bf5a-43d0-a88c-086a3406940d"
+```
+
+Make audit retention policy and telemetry export inspectable beyond metadata.
+
+Acceptance:
+
+- Audit sinks expose retention eligibility or pruning plans.
+- Telemetry export emits deterministic local event batches.
+- Tests cover review-gated apply, policy denial, and package compile traces.
+
+## T05 - Grow evaluation threshold reporting
+
+```task
+id: PMEM-WP-0012-T05
+status: todo
+priority: medium
+state_hub_task_id: "305729e2-23ff-4043-9356-0df83f8e6d7b"
+```
+
+Promote the evaluation scenarios into a threshold report suitable for regression
+tracking.
+
+Acceptance:
+
+- Evaluation report includes policy, lifecycle, path, semantic, and budget
+  metrics.
+- Threshold assertions produce actionable diagnostics.
+- Fixture additions do not require live dependencies.
+
+## T06 - Add public API compatibility checks
+
+```task
+id: PMEM-WP-0012-T06
+status: todo
+priority: medium
+state_hub_task_id: "78f9d0d8-dc9d-4f43-a32d-92e17b3c5122"
+```
+
+Protect the embedding surface now documented as stable.
+
+Acceptance:
+
+- Public exports have a compatibility snapshot or explicit changelog gate.
+- Service operation catalog and local runner handlers stay in parity by test.
+- Docs identify how breaking changes should be handled.
+
+## Acceptance Criteria
+
+- Scorecard has concrete evidence toward the 4.3+ gate.
+- Optional operational surfaces stay optional and dependency-light by default.
+- Live-shaped adapters can be validated by the same compatibility contract as
+  fake packs.
+
+## Closure Review
+
+Pending implementation.