diff --git a/docs/maturity-scorecard.md b/docs/maturity-scorecard.md index 67a68c5..ba22e55 100644 --- a/docs/maturity-scorecard.md +++ b/docs/maturity-scorecard.md @@ -26,24 +26,25 @@ to 5. ## Current Score -Overall maturity: **4.3 / 5** +Overall maturity: **4.4 / 5** Two sub-scores make the result easier to reason about: -- Local integration maturity: **4.6 / 5** -- Operational maturity: **4.0 / 5** +- Local integration maturity: **4.7 / 5** +- Operational maturity: **4.2 / 5** The repo is strong as a deterministic local library and service-boundary core. -It is not yet production-operational because adapter coverage is still -credential-gated rather than continuously exercised against live services, and -service packaging is stdlib/local rather than deployed to a managed environment. +It now has credential-safe operator artifacts, managed deployment manifest +validation, persisted evaluation trend histories, and a troubleshooting matrix. +It is not yet production-operational because real endpoint and managed platform +evidence still requires an approved operator environment. ## Dimension Scorecard | Dimension | Score | Target | Evidence | Needed Next | | --- | ---: | ---: | --- | --- | | Intent and boundaries | 4.4 | 5.0 | `INTENT.md`, `SCOPE.md`, `README.md`, architecture docs, adjacent-repo boundary docs | Keep docs current as live adapters and service bindings clarify real ownership. | -| Package and API foundation | 4.6 | 4.8 | Python package, public exports, runtime facade, CLI, service runner export, service config, dependency-light tests, public API snapshot, release-note template | Add compatibility migration examples from a real release. | +| Package and API foundation | 4.7 | 4.8 | Python package, public exports, runtime facade, CLI, service runner export, service config, deployment/troubleshooting helpers, dependency-light tests, public API snapshot, release-note template | Add compatibility migration examples from a real release. 
| | Markitect profile contract ingress | 3.7 | 4.5 | Profile loading, diagnostics, runtime envelopes, profile-derived config, local alias normalization | Add richer compatibility fixtures and schema drift diagnostics. | | Graph and event ingress | 4.0 | 4.5 | Graph loading, endpoint diagnostics, event model, JSONL log, export, repair checks, corrupt-record diagnostics, fake and live-shaped graph/event adapters | Add broader malformed/large graph fixtures and operator repair utilities. | | Phase domain model | 3.5 | 4.5 | Phases, lifecycle states, actions, paths, retention rules, profile-derived transition rules | Add migration semantics for profile/rule changes over durable stores. | @@ -51,13 +52,13 @@ service packaging is stdlib/local rather than deployed to a managed environment. | Lifecycle planning and apply | 4.1 | 4.5 | Dry-run lifecycle plans, profile rules, review-gated local apply, service `lifecycle.apply`, apply audit/export queries | Add richer apply rollback and repair drills. | | Activation planning | 4.0 | 4.8 | Budgeted activation, selections, package request, graph neighborhoods, paths, ranking, metrics, multi-scenario evaluation fixtures | Wire semantic-index-assisted retrieval into runtime planning. | | Local persistence | 4.0 | 4.5 | File-backed graph store, JSONL event log, audit sink, atomic JSON writes, executable metadata migrations, migration audit, export, repair diagnostics | Add compaction/retention utilities and stronger corruption recovery. | -| Policy, review, and audit | 4.4 | 5.0 | Operation points, review records, audit schema, queryable/exportable audit sinks, retention plans and apply, denials, redaction, fake/live-shaped policy/audit adapters | Add live policy adapter boundary and credentialed telemetry pruning drill. 
| -| Observability and operations | 4.3 | 4.8 | Health report, readiness report, config diagnostics, adapter status, service binding, stdlib service entrypoint, operator runbook, fake/live-shaped telemetry audit sinks | Add metrics/event export to external telemetry and managed deployment packaging. | -| Markitect interop | 4.1 | 4.5 | Local validation, package request/response envelopes, fake/live-shaped compiler fixtures, credential-gated drill contract | Add credentialed Markitect compiler execution and schema drift suite. | -| Kontextual/Infospace interop | 3.9 | 4.5 | Delegation envelope, fake/live-shaped runtime registry, credential-gated drill contract, activation quality report fixture, adapter compatibility manifests | Add credentialed Kontextual execution and broader Infospace restart reports. | -| Testing and evaluation | 4.5 | 4.7 | Deterministic tests over runtime, CLI, adapters, policy, activation, lifecycle, service, fakes, live-shaped packs, credential skip gates, API snapshots, evaluation threshold and trend reports | Add larger regression corpus and persisted trend history. | -| Service readiness | 4.6 | 4.8 | Service contracts, full local runner parity, framework-neutral service binding, WSGI adapter, stdlib service entrypoint, health/readiness, config, adapter conformance | Add managed deployment packaging. | -| Developer experience | 4.5 | 4.7 | README, package map, CLI examples, persistence/policy/interop/service/lifecycle/fake-pack docs, operational recipe, operator runbook, API compatibility docs, release-note template | Add troubleshooting matrix from real operator feedback. | +| Policy, review, and audit | 4.5 | 5.0 | Operation points, review records, audit schema, queryable/exportable audit sinks, retention plans and apply, denials, redaction, fake/live-shaped policy/audit adapters, credential-safe telemetry retention drill | Add live policy adapter boundary and external telemetry pruning evidence. 
| +| Observability and operations | 4.5 | 4.8 | Health report, readiness report, config diagnostics, adapter status, service binding, stdlib service entrypoint, managed deployment manifest validation, operator runbook, fake/live-shaped telemetry audit sinks | Pilot the managed package in an operator deployment target. | +| Markitect interop | 4.2 | 4.5 | Local validation, package request/response envelopes, fake/live-shaped compiler fixtures, credential-gated drill contract, redacted operator reports | Add credentialed Markitect compiler execution and schema drift suite. | +| Kontextual/Infospace interop | 4.0 | 4.5 | Delegation envelope, fake/live-shaped runtime registry, credential-gated drill contract, redacted operator reports, activation quality report fixture, adapter compatibility manifests | Add credentialed Kontextual execution and broader Infospace restart reports. | +| Testing and evaluation | 4.6 | 4.7 | Deterministic tests over runtime, CLI, adapters, policy, activation, lifecycle, service, fakes, live-shaped packs, credential skip gates, API snapshots, evaluation threshold/trend reports, persisted trend history | Add larger regression corpus and make trend history a release gate. | +| Service readiness | 4.7 | 4.8 | Service contracts, full local runner parity, framework-neutral service binding, WSGI adapter, stdlib service entrypoint, health/readiness, config, adapter conformance, managed deployment manifest validation | Pilot managed deployment packaging on the target platform. | +| Developer experience | 4.6 | 4.7 | README, package map, CLI examples, persistence/policy/interop/service/lifecycle/fake-pack docs, operational recipe, operator runbook, API compatibility docs, release-note template, troubleshooting matrix | Refine troubleshooting from real operator feedback. 
| ## Assessment @@ -68,9 +69,10 @@ and live-shaped external pack manifests, credential-gated drills, service binding and stdlib entrypoint, API snapshots, release discipline, and conformance helpers form a solid integration boundary. -The biggest optimization opportunity is now the next operational layer: -running the credential-gated drills against real services, adding managed -deployment packaging, and growing evaluation trends into a historical corpus. +The biggest optimization opportunity is now evidence, not scaffolding: +run the credentialed reports against real services, pilot the managed manifest +on a target platform, and make persisted trend history part of the operator +release gate. ## Completed Refinement Workplan @@ -108,19 +110,30 @@ deployment packaging, and growing evaluation trends into a historical corpus. - evaluation trend artifacts with threshold and regression deltas; - release-note template gating for public API snapshot changes. +`PMEM-WP-0014` moved the score from 4.3 to 4.4 by adding: + +- credential-safe operator reports with token and endpoint redaction; +- credentialed telemetry retention drill coverage through live-shaped or + operator-approved fixture paths; +- managed deployment manifest generation and validation for service entrypoint, + probes, rollback, replicas, and local-store mounts; +- deterministic persisted evaluation trend history; +- operator troubleshooting matrix coverage for credential, readiness, + migration, audit retention, and adapter-manifest failures. + ## Recommended Next Refinement -Create and execute `PMEM-WP-0014`: live credential execution and managed -deployment hardening. +Create and execute `PMEM-WP-0015`: credentialed live pilot and deployment +evidence. Highest-value tasks: -- Run the credential-gated drills against real Markitect/Kontextual endpoints - in an operator environment. -- Add managed deployment packaging and readiness probes. -- Persist evaluation trend reports across runs. 
-- Add credentialed telemetry export and retention pruning drills. -- Expand troubleshooting from actual operator feedback. +- Run the redacted credentialed report against real Markitect/Kontextual + endpoints in an operator environment. +- Pilot the managed deployment manifest on the target platform. +- Capture external telemetry retention evidence. +- Promote trend history into a release/regression gate. +- Refine troubleshooting from actual operator feedback. ## Score Movement Gates @@ -139,6 +152,15 @@ Achieved overall score **4.3+** when: - Operational docs include deployable service packaging and an operator readiness runbook. +Achieved overall score **4.4+** when: + +- Credentialed operator report artifacts redact credential values and endpoint + URLs. +- Managed deployment manifest validation covers service entrypoint, probes, + rollback, replicas, and store mounts. +- Evaluation trend artifacts can be persisted into deterministic history. +- Troubleshooting docs map common operator diagnostics to actions. + Move overall score to **4.7+** only when: - Live adapter behavior, telemetry, audit retention, migration, and evaluation diff --git a/docs/operator-readiness-runbook.md b/docs/operator-readiness-runbook.md index d4e6ff2..1914710 100644 --- a/docs/operator-readiness-runbook.md +++ b/docs/operator-readiness-runbook.md @@ -130,6 +130,72 @@ python3 -m pytest tests/test_credentialed_drills.py The report redacts tokens and uses a credential fingerprint rather than persisting secrets. 
+Persist a redacted operator report from the same environment: + +```python +from phase_memory import write_credentialed_operator_report + +write_credentialed_operator_report("reports/credentialed-operator-report.json") +``` + +Run the credentialed telemetry retention drill when an operator has approved +using the local fixture path or the required credentials are present: + +```python +from phase_memory import credentialed_telemetry_retention_drill + +report = credentialed_telemetry_retention_drill(operator_approved_fixture=True) +``` + +The drill records old and new audit events, plans retention, applies pruning, +and reports retained/pruned operation ids without storing credential values. + +## Managed Deployment Manifest + +Build and validate a deployment manifest before handing it to platform-specific +packaging: + +```python +from phase_memory import managed_deployment_manifest, validate_managed_deployment_manifest +from phase_memory import ServiceAppConfig + +manifest = managed_deployment_manifest( + ServiceAppConfig(host="0.0.0.0", port=8080, local_store_path="/var/lib/phase-memory") +) +validation = validate_managed_deployment_manifest(manifest) +``` + +Required manifest features: + +- `phase-memory-service` command entrypoint; +- `/health` liveness probe; +- `/ready` readiness probe; +- writable local-store mount; +- rollback checks that include `phase-memory-service --check` and + `runtime.repair_diagnostics`. + +## Evaluation Trend History + +Persist trend artifacts into a history file after evaluation runs: + +```python +from phase_memory import write_evaluation_trend_history + +history = write_evaluation_trend_history("reports/evaluation-trend-history.json", trend) +``` + +Repeated writes of the same trend id do not duplicate the run. 
+ +## Troubleshooting Matrix + +| Category | Diagnostic | Operator action | +| --- | --- | --- | +| Credentials | `credential_env_missing` | Set the four credential environment variables in the drill shell; do not write them to files. | +| Readiness | `unsupported_operation` | Run service contract and public API snapshot tests, then update dispatch or release notes. | +| Migrations | `store_migration_unsupported` | Use a file-backed local store or run repair diagnostics before accepting traffic. | +| Audit retention | `audit_retention_apply_unsupported` | Switch to a JSONL or telemetry audit sink with retention support, then rerun the retention drill. | +| Adapter manifest | `adapter_pack_manifest_invalid` | Regenerate and validate the adapter pack manifest before using the pack. | + ## Compatibility Release Discipline When public exports or service operations change: diff --git a/src/phase_memory/__init__.py b/src/phase_memory/__init__.py index b2b0795..766c948 100644 --- a/src/phase_memory/__init__.py +++ b/src/phase_memory/__init__.py @@ -13,12 +13,32 @@ from .contracts import graph_from_markitect, profile_from_markitect from .credentialed_drills import ( CREDENTIALED_ADAPTER_ENV_VARS, CREDENTIALED_DRILL_SCHEMA, + CREDENTIALED_OPERATOR_REPORT_SCHEMA, + CREDENTIALED_TELEMETRY_DRILL_SCHEMA, CredentialedDrillConfig, credentialed_adapter_smoke_report, credentialed_drill_config_from_env, + credentialed_operator_report, + credentialed_telemetry_retention_drill, missing_credentialed_adapter_env, + write_credentialed_operator_report, +) +from .deployment import ( + MANAGED_DEPLOYMENT_SCHEMA, + MANAGED_DEPLOYMENT_VALIDATION_SCHEMA, + managed_deployment_manifest, + validate_managed_deployment_manifest, +) +from .evaluation import ( + EVALUATION_REPORT_SCHEMA, + EVALUATION_TREND_HISTORY_SCHEMA, + EVALUATION_TREND_SCHEMA, + evaluation_threshold_report, + evaluation_trend_artifact, + evaluation_trend_history, + load_evaluation_trend_history, + write_evaluation_trend_history, 
) -from .evaluation import EVALUATION_REPORT_SCHEMA, EVALUATION_TREND_SCHEMA, evaluation_threshold_report, evaluation_trend_artifact from .external_adapters import ( ADAPTER_PACK_MANIFEST_SCHEMA, ExternalAdapterPack, @@ -88,16 +108,25 @@ from .service_app import SERVICE_APP_SCHEMA, ServiceAppConfig, build_service_bin from .service_binding import READINESS_REPORT_SCHEMA, SERVICE_BINDING_SCHEMA, ServiceBinding, ServiceResponse, service_binding_from_config from .planner import plan_profile_execution from .runtime import PhaseMemoryRuntime +from .troubleshooting import ( + TROUBLESHOOTING_MATRIX_SCHEMA, + TROUBLESHOOTING_REQUIRED_CATEGORIES, + operator_troubleshooting_matrix, + validate_operator_troubleshooting_matrix, +) __all__ = [ "ActivationPlan", "ADAPTER_PACK_MANIFEST_SCHEMA", "CREDENTIALED_ADAPTER_ENV_VARS", "CREDENTIALED_DRILL_SCHEMA", + "CREDENTIALED_OPERATOR_REPORT_SCHEMA", + "CREDENTIALED_TELEMETRY_DRILL_SCHEMA", "CredentialedDrillConfig", "Diagnostic", "ExternalAdapterPack", "EVALUATION_REPORT_SCHEMA", + "EVALUATION_TREND_HISTORY_SCHEMA", "EVALUATION_TREND_SCHEMA", "FakeExternalEventLog", "FakeExternalGraphStore", @@ -137,6 +166,8 @@ __all__ = [ "MemoryOperation", "MARKITECT_PACKAGE_REQUEST_SCHEMA", "MARKITECT_PACKAGE_RESPONSE_SCHEMA", + "MANAGED_DEPLOYMENT_SCHEMA", + "MANAGED_DEPLOYMENT_VALIDATION_SCHEMA", "LocalMarkitectValidator", "OptionalMarkitectValidator", "abandon_path", @@ -145,11 +176,15 @@ __all__ = [ "create_path", "credentialed_adapter_smoke_report", "credentialed_drill_config_from_env", + "credentialed_operator_report", + "credentialed_telemetry_retention_drill", "graph_from_markitect", "evaluation_threshold_report", "evaluation_trend_artifact", + "evaluation_trend_history", "merge_path", "make_review_record", + "managed_deployment_manifest", "plan_activation", "plan_compaction", "plan_lifecycle_from_profile", @@ -167,6 +202,7 @@ __all__ = [ "adapter_pack_manifest", "validate_adapter_pack_manifest", "path_event", + 
"operator_troubleshooting_matrix", "package_request_from_selection", "package_response_envelope", "WordCountTokenEstimator", @@ -183,6 +219,8 @@ __all__ = [ "ServiceBinding", "ServiceAppConfig", "ServiceResponse", + "TROUBLESHOOTING_MATRIX_SCHEMA", + "TROUBLESHOOTING_REQUIRED_CATEGORIES", "build_service_binding", "create_wsgi_app", "health_report", @@ -191,6 +229,11 @@ __all__ = [ "service_binding_from_config", "service_app_metadata", "service_contracts", + "load_evaluation_trend_history", + "validate_managed_deployment_manifest", + "validate_operator_troubleshooting_matrix", + "write_credentialed_operator_report", + "write_evaluation_trend_history", ] __version__ = "0.1.0" diff --git a/src/phase_memory/credentialed_drills.py b/src/phase_memory/credentialed_drills.py index f3cbe63..5b14209 100644 --- a/src/phase_memory/credentialed_drills.py +++ b/src/phase_memory/credentialed_drills.py @@ -7,14 +7,20 @@ manifest/conformance path used by live-shaped adapter fixtures. from __future__ import annotations +import json from dataclasses import dataclass +from datetime import datetime, timezone +from pathlib import Path from typing import Mapping from .external_adapters import live_shaped_adapter_pack, validate_adapter_pack_manifest +from .runtime import PhaseMemoryRuntime from .service import default_conformance_adapters from .utils import stable_digest CREDENTIALED_DRILL_SCHEMA = "phase_memory.credentialed_adapter_drill.v1" +CREDENTIALED_OPERATOR_REPORT_SCHEMA = "phase_memory.credentialed_operator_report.v1" +CREDENTIALED_TELEMETRY_DRILL_SCHEMA = "phase_memory.credentialed_telemetry_retention_drill.v1" CREDENTIALED_ADAPTER_ENV_VARS = ( "PHASE_MEMORY_MARKITECT_URL", "PHASE_MEMORY_MARKITECT_TOKEN", @@ -30,8 +36,8 @@ class CredentialedDrillConfig: def to_dict(self) -> dict[str, str]: return { - "markitect_url": self.markitect_url, - "kontextual_url": self.kontextual_url, + "markitect_endpoint_fingerprint": stable_digest(self.markitect_url), + 
"kontextual_endpoint_fingerprint": stable_digest(self.kontextual_url), "credential_fingerprint": stable_digest([self.markitect_url, self.kontextual_url]), } @@ -84,3 +90,110 @@ def credentialed_adapter_smoke_report(environ: Mapping[str, str] | None = None) "conformance_helpers": sorted(conformance_adapters), "diagnostics": [diagnostic.to_dict() for diagnostic in diagnostics], } + + +def credentialed_operator_report( + environ: Mapping[str, str] | None = None, + *, + run_id: str = "operator-drill", + mode: str = "credentialed", +) -> dict: + smoke = credentialed_adapter_smoke_report(environ) + present = sorted(name for name in CREDENTIALED_ADAPTER_ENV_VARS if (environ or {}).get(name)) + return { + "schema_version": CREDENTIALED_OPERATOR_REPORT_SCHEMA, + "id": f"credentialed-operator-report:{stable_digest([run_id, smoke])}", + "run": {"id": run_id, "mode": mode}, + "valid": bool(smoke.get("valid")), + "skipped": bool(smoke.get("skipped")), + "redacted_env": { + "present": present, + "missing": list(smoke.get("missing_env", ())), + "secrets_redacted": True, + }, + "smoke_report": smoke, + "diagnostics": list(smoke.get("diagnostics", ())), + } + + +def write_credentialed_operator_report( + path: str | Path, + environ: Mapping[str, str] | None = None, + *, + run_id: str = "operator-drill", + mode: str = "credentialed", +) -> dict: + report = credentialed_operator_report(environ, run_id=run_id, mode=mode) + path = Path(path) + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(json.dumps(report, indent=2, sort_keys=True) + "\n", encoding="utf-8") + return report + + +def credentialed_telemetry_retention_drill( + environ: Mapping[str, str] | None = None, + *, + retention_days: int = 30, + operator_approved_fixture: bool = False, + now: datetime | None = None, +) -> dict: + missing = missing_credentialed_adapter_env(environ) + if missing and not operator_approved_fixture: + return { + "schema_version": CREDENTIALED_TELEMETRY_DRILL_SCHEMA, + "valid": False, + 
"skipped": True, + "missing_env": list(missing), + "diagnostics": [ + { + "severity": "warn", + "code": "credential_env_missing", + "message": "Telemetry retention drill skipped because required environment variables are missing.", + "metadata": {"required_env": list(CREDENTIALED_ADAPTER_ENV_VARS), "missing_env": list(missing)}, + } + ], + } + + pack = live_shaped_adapter_pack() + runtime = PhaseMemoryRuntime(audit_sink=pack.adapters["audit_sink"]) + runtime.audit_sink.record(_audit_fixture("op:old", "2026-01-01T00:00:00+00:00")) + runtime.audit_sink.record(_audit_fixture("op:new", "2026-05-18T00:00:00+00:00")) + now = now or datetime(2026, 5, 19, tzinfo=timezone.utc) + plan = runtime.audit_retention_plan(retention_days=retention_days, now=now, source_ref="credentialed-telemetry-drill") + applied = runtime.apply_audit_retention(plan["plan"], source_ref="credentialed-telemetry-drill") + export = runtime.export_audit_events(source_ref="credentialed-telemetry-drill") + retained_ids = [ + str(event.get("operation_id") or "") + for event in export["batch"]["events"] + if event.get("operation") != "audit.export" + ] + return { + "schema_version": CREDENTIALED_TELEMETRY_DRILL_SCHEMA, + "valid": plan["valid"] and applied["valid"], + "skipped": False, + "operator_approved_fixture": operator_approved_fixture, + "redacted_env": { + "present": sorted(name for name in CREDENTIALED_ADAPTER_ENV_VARS if (environ or {}).get(name)), + "missing": list(missing), + "secrets_redacted": True, + }, + "plan": plan["plan"], + "apply_result": applied["result"], + "pruned_operation_ids": applied["result"].get("pruned_operation_ids", []), + "retained_operation_ids": retained_ids, + "audit_operations": [event.get("operation") for event in export["batch"]["events"]], + "diagnostics": plan["diagnostics"] + applied["diagnostics"], + } + + +def _audit_fixture(operation_id: str, timestamp: str) -> dict: + return { + "schema_version": "phase_memory.audit.event.v1", + "operation_id": operation_id, + 
"operation": "credentialed.telemetry.fixture", + "timestamp": timestamp, + "subject": {"kind": "audit_events", "id": operation_id}, + "source": {"ref": "credentialed-telemetry-drill"}, + "dry_run": True, + "allowed": True, + } diff --git a/src/phase_memory/deployment.py b/src/phase_memory/deployment.py new file mode 100644 index 0000000..7f2f842 --- /dev/null +++ b/src/phase_memory/deployment.py @@ -0,0 +1,128 @@ +"""Managed deployment manifest helpers for the service app.""" + +from __future__ import annotations + +from typing import Any + +from .models import Diagnostic +from .service_app import ServiceAppConfig, service_app_metadata +from .utils import stable_digest + +MANAGED_DEPLOYMENT_SCHEMA = "phase_memory.managed_deployment.v1" +MANAGED_DEPLOYMENT_VALIDATION_SCHEMA = "phase_memory.managed_deployment.validation.v1" + + +def managed_deployment_manifest( + config: ServiceAppConfig | None = None, + *, + image: str = "phase-memory:local", + namespace: str = "phase-memory", + replicas: int = 1, +) -> dict[str, Any]: + config = config or ServiceAppConfig(host="0.0.0.0") + command = [ + "phase-memory-service", + "--host", + config.host, + "--port", + str(config.port), + "--store", + config.local_store_path, + ] + manifest = { + "schema_version": MANAGED_DEPLOYMENT_SCHEMA, + "service": { + "name": "phase-memory-service", + "namespace": namespace, + "image": image, + "replicas": replicas, + "command": command, + "ports": [{"name": "http", "container_port": config.port}], + }, + "runtime": service_app_metadata(config), + "storage": { + "volumes": [ + { + "name": "phase-memory-local-store", + "mount_path": config.local_store_path, + "purpose": "local graph, event, and audit state", + } + ] + }, + "probes": { + "liveness": {"path": "/health", "port": config.port}, + "readiness": {"path": "/ready", "port": config.port}, + }, + "rollback": { + "requires_store_snapshot": True, + "checks": ["phase-memory-service --check", "GET /ready", "runtime.repair_diagnostics"], + }, + 
} + manifest["id"] = f"managed-deployment:{stable_digest(manifest)}" + return manifest + + +def validate_managed_deployment_manifest(manifest: dict[str, Any]) -> dict[str, Any]: + diagnostics: list[Diagnostic] = [] + service = manifest.get("service") if isinstance(manifest.get("service"), dict) else {} + command = service.get("command") if isinstance(service.get("command"), list) else [] + probes = manifest.get("probes") if isinstance(manifest.get("probes"), dict) else {} + storage = manifest.get("storage") if isinstance(manifest.get("storage"), dict) else {} + volumes = storage.get("volumes") if isinstance(storage.get("volumes"), list) else [] + + if manifest.get("schema_version") != MANAGED_DEPLOYMENT_SCHEMA: + diagnostics.append( + Diagnostic( + "error", + "managed_deployment_schema_mismatch", + "Managed deployment manifest has an unexpected schema version.", + "schema_version", + {"expected": MANAGED_DEPLOYMENT_SCHEMA}, + ) + ) + if "phase-memory-service" not in command: + diagnostics.append( + Diagnostic( + "error", + "managed_deployment_missing_service_entrypoint", + "Managed deployment command must invoke phase-memory-service.", + "service.command", + ) + ) + for name, path in (("liveness", "/health"), ("readiness", "/ready")): + probe = probes.get(name) if isinstance(probes.get(name), dict) else {} + if probe.get("path") != path: + diagnostics.append( + Diagnostic( + "error", + "managed_deployment_probe_missing", + "Managed deployment manifest must declare health and readiness probes.", + f"probes.{name}.path", + {"expected": path}, + ) + ) + if not any(volume.get("mount_path") for volume in volumes if isinstance(volume, dict)): + diagnostics.append( + Diagnostic( + "error", + "managed_deployment_store_mount_missing", + "Managed deployment manifest must include a writable local-store mount.", + "storage.volumes", + ) + ) + if int(service.get("replicas") or 0) < 1: + diagnostics.append( + Diagnostic( + "error", + "managed_deployment_replica_count_invalid", + 
"Managed deployment manifest must request at least one replica.", + "service.replicas", + ) + ) + + return { + "schema_version": MANAGED_DEPLOYMENT_VALIDATION_SCHEMA, + "valid": not any(diagnostic.severity == "error" for diagnostic in diagnostics), + "manifest_id": str(manifest.get("id") or ""), + "diagnostics": [diagnostic.to_dict() for diagnostic in diagnostics], + } diff --git a/src/phase_memory/evaluation.py b/src/phase_memory/evaluation.py index 71b1ead..7f160bc 100644 --- a/src/phase_memory/evaluation.py +++ b/src/phase_memory/evaluation.py @@ -2,7 +2,9 @@ from __future__ import annotations +import json from datetime import datetime, timezone +from pathlib import Path from typing import Any from .adapters import InMemorySemanticIndex @@ -14,6 +16,7 @@ from .utils import stable_digest, utc_now_iso EVALUATION_REPORT_SCHEMA = "phase_memory.evaluation.threshold_report.v1" EVALUATION_TREND_SCHEMA = "phase_memory.evaluation.trend_artifact.v1" +EVALUATION_TREND_HISTORY_SCHEMA = "phase_memory.evaluation.trend_history.v1" DEFAULT_THRESHOLDS = { "policy_denial_count": 1, @@ -115,6 +118,68 @@ def evaluation_trend_artifact( "previous_report_id": (previous_report or {}).get("id", ""), "diagnostics": diagnostics, } + + +def evaluation_trend_history(artifacts: list[dict[str, Any]] | tuple[dict[str, Any], ...]) -> dict[str, Any]: + ordered = sorted( + (dict(artifact) for artifact in artifacts), + key=lambda artifact: ( + str((artifact.get("run") or {}).get("created_at") or ""), + str((artifact.get("run") or {}).get("run_id") or ""), + str(artifact.get("id") or ""), + ), + ) + metric_keys = sorted({str(key) for artifact in ordered for key in (artifact.get("metrics") or {})}) + diagnostics = [ + Diagnostic( + "error", + "evaluation_trend_history_invalid_artifact", + "Trend history can only contain evaluation trend artifacts.", + f"artifacts.{index}.schema_version", + {"artifact_id": artifact.get("id", "")}, + ).to_dict() + for index, artifact in enumerate(ordered) + if 
artifact.get("schema_version") != EVALUATION_TREND_SCHEMA + ] + return { + "schema_version": EVALUATION_TREND_HISTORY_SCHEMA, + "id": f"evaluation-trend-history:{stable_digest([artifact.get('id', '') for artifact in ordered])}", + "valid": not diagnostics, + "count": len(ordered), + "metric_keys": metric_keys, + "latest_artifact_id": ordered[-1].get("id", "") if ordered else "", + "artifacts": ordered, + "diagnostics": diagnostics, + } + + +def load_evaluation_trend_history(path: str | Path) -> dict[str, Any]: + path = Path(path) + if not path.exists(): + return evaluation_trend_history(()) + data = json.loads(path.read_text(encoding="utf-8")) + if data.get("schema_version") == EVALUATION_TREND_HISTORY_SCHEMA: + return data + if data.get("schema_version") == EVALUATION_TREND_SCHEMA: + return evaluation_trend_history((data,)) + return evaluation_trend_history((data,)) + + +def write_evaluation_trend_history(path: str | Path, artifact: dict[str, Any]) -> dict[str, Any]: + path = Path(path) + existing = load_evaluation_trend_history(path) + artifacts = list(existing.get("artifacts") or ()) + artifact_id = str(artifact.get("id") or "") + if artifact_id and not any(str(item.get("id") or "") == artifact_id for item in artifacts): + artifacts.append(dict(artifact)) + elif not artifact_id: + artifacts.append(dict(artifact)) + history = evaluation_trend_history(tuple(artifacts)) + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(json.dumps(history, indent=2, sort_keys=True) + "\n", encoding="utf-8") + return history + + def _policy_scenario(scenario: dict[str, Any]) -> dict[str, Any]: runtime = PhaseMemoryRuntime() response = runtime.plan_activation( diff --git a/src/phase_memory/troubleshooting.py b/src/phase_memory/troubleshooting.py new file mode 100644 index 0000000..edbd88b --- /dev/null +++ b/src/phase_memory/troubleshooting.py @@ -0,0 +1,112 @@ +"""Operator troubleshooting matrix for readiness and deployment drills.""" + +from __future__ import 
annotations + +from typing import Any + +from .models import Diagnostic +from .utils import stable_digest + +TROUBLESHOOTING_MATRIX_SCHEMA = "phase_memory.operator_troubleshooting.v1" +TROUBLESHOOTING_REQUIRED_CATEGORIES = ( + "credentials", + "readiness", + "migrations", + "audit_retention", + "adapter_manifest", +) + + +def operator_troubleshooting_matrix() -> dict[str, Any]: + rows = [ + { + "category": "credentials", + "diagnostic_code": "credential_env_missing", + "signal": "Credentialed drill reports skipped: true.", + "likely_cause": "One or more required operator environment variables are absent.", + "operator_action": "Set the Markitect and Kontextual URL/token variables in the shell that runs the drill; do not persist them to files.", + }, + { + "category": "readiness", + "diagnostic_code": "unsupported_operation", + "signal": "Readiness reports unsupported operations or a failing /ready response.", + "likely_cause": "The service binding and local runner operation catalogs are out of sync.", + "operator_action": "Run the public API snapshot and service contract tests, then update operation dispatch or release notes.", + }, + { + "category": "migrations", + "diagnostic_code": "store_migration_unsupported", + "signal": "Migration plan/apply returns an error diagnostic.", + "likely_cause": "The configured graph store does not expose migration planning.", + "operator_action": "Switch to a file-backed local store or run repair diagnostics before accepting traffic.", + }, + { + "category": "audit_retention", + "diagnostic_code": "audit_retention_apply_unsupported", + "signal": "Retention apply reports unsupported behavior.", + "likely_cause": "The configured audit sink cannot apply retention plans.", + "operator_action": "Use a JSONL or telemetry audit sink with retention support, then rerun the retention drill.", + }, + { + "category": "adapter_manifest", + "diagnostic_code": "adapter_pack_manifest_invalid", + "signal": "Adapter pack validation returns an 
error diagnostic.", + "likely_cause": "A required external adapter capability is missing from the pack.", + "operator_action": "Regenerate the adapter pack manifest and confirm graph, event, policy, package, audit, semantic, and registry adapters are declared.", + }, + ] + return { + "schema_version": TROUBLESHOOTING_MATRIX_SCHEMA, + "id": f"operator-troubleshooting:{stable_digest(rows)}", + "required_categories": list(TROUBLESHOOTING_REQUIRED_CATEGORIES), + "rows": rows, + } + + +def validate_operator_troubleshooting_matrix(matrix: dict[str, Any]) -> dict[str, Any]: + diagnostics: list[Diagnostic] = [] + rows = matrix.get("rows") if isinstance(matrix.get("rows"), list) else [] + categories = {str(row.get("category") or "") for row in rows if isinstance(row, dict)} + missing = sorted(set(TROUBLESHOOTING_REQUIRED_CATEGORIES) - categories) + if matrix.get("schema_version") != TROUBLESHOOTING_MATRIX_SCHEMA: + diagnostics.append( + Diagnostic( + "error", + "troubleshooting_matrix_schema_mismatch", + "Troubleshooting matrix has an unexpected schema version.", + "schema_version", + {"expected": TROUBLESHOOTING_MATRIX_SCHEMA}, + ) + ) + if missing: + diagnostics.append( + Diagnostic( + "error", + "troubleshooting_matrix_missing_category", + "Troubleshooting matrix must cover every required operator category.", + "rows", + {"missing": missing}, + ) + ) + required_fields = ("category", "diagnostic_code", "signal", "likely_cause", "operator_action") + for index, row in enumerate(rows): + if not isinstance(row, dict): + diagnostics.append( + Diagnostic("error", "troubleshooting_matrix_invalid_row", "Troubleshooting row must be an object.", f"rows.{index}") + ) + continue + for field in required_fields: + if not row.get(field): + diagnostics.append( + Diagnostic( + "error", + "troubleshooting_matrix_missing_field", + "Troubleshooting row is missing a required field.", + f"rows.{index}.{field}", + ) + ) + return { + "schema_version": 
f"{TROUBLESHOOTING_MATRIX_SCHEMA}.validation", + "valid": not any(diagnostic.severity == "error" for diagnostic in diagnostics), + "diagnostics": [diagnostic.to_dict() for diagnostic in diagnostics], + } diff --git a/tests/fixtures/public-api-snapshot.json b/tests/fixtures/public-api-snapshot.json index ac26d56..2a0431c 100644 --- a/tests/fixtures/public-api-snapshot.json +++ b/tests/fixtures/public-api-snapshot.json @@ -7,9 +7,12 @@ "ActivationPlan", "CREDENTIALED_ADAPTER_ENV_VARS", "CREDENTIALED_DRILL_SCHEMA", + "CREDENTIALED_OPERATOR_REPORT_SCHEMA", + "CREDENTIALED_TELEMETRY_DRILL_SCHEMA", "CredentialedDrillConfig", "Diagnostic", "EVALUATION_REPORT_SCHEMA", + "EVALUATION_TREND_HISTORY_SCHEMA", "EVALUATION_TREND_SCHEMA", "ExternalAdapterPack", "FakeExternalEventLog", @@ -32,6 +35,8 @@ "LiveShapedTelemetryAuditSink", "LocalMarkitectValidator", "LocalServiceRunner", + "MANAGED_DEPLOYMENT_SCHEMA", + "MANAGED_DEPLOYMENT_VALIDATION_SCHEMA", "MARKITECT_PACKAGE_REQUEST_SCHEMA", "MARKITECT_PACKAGE_RESPONSE_SCHEMA", "MemoryEdge", @@ -61,6 +66,8 @@ "ServiceAppConfig", "ServiceBinding", "ServiceResponse", + "TROUBLESHOOTING_MATRIX_SCHEMA", + "TROUBLESHOOTING_REQUIRED_CATEGORIES", "WordCountTokenEstimator", "abandon_path", "activation_quality_report", @@ -72,16 +79,22 @@ "create_wsgi_app", "credentialed_adapter_smoke_report", "credentialed_drill_config_from_env", + "credentialed_operator_report", + "credentialed_telemetry_retention_drill", "evaluation_threshold_report", "evaluation_trend_artifact", + "evaluation_trend_history", "fake_external_adapter_pack", "fake_external_runtime_config", "graph_from_markitect", "health_report", "live_shaped_adapter_pack", + "load_evaluation_trend_history", "make_review_record", + "managed_deployment_manifest", "merge_path", "missing_credentialed_adapter_env", + "operator_troubleshooting_matrix", "package_request_from_selection", "package_response_envelope", "path_event", @@ -103,7 +116,11 @@ "service_app_metadata", 
"service_binding_from_config", "service_contracts", - "validate_adapter_pack_manifest" + "validate_adapter_pack_manifest", + "validate_managed_deployment_manifest", + "validate_operator_troubleshooting_matrix", + "write_credentialed_operator_report", + "write_evaluation_trend_history" ], "service_operations": [ "audit.query", diff --git a/tests/test_credentialed_drills.py b/tests/test_credentialed_drills.py index 70fe5f4..753c437 100644 --- a/tests/test_credentialed_drills.py +++ b/tests/test_credentialed_drills.py @@ -1,11 +1,16 @@ import os +import json +from datetime import datetime, timezone import pytest from phase_memory.credentialed_drills import ( CREDENTIALED_ADAPTER_ENV_VARS, credentialed_adapter_smoke_report, + credentialed_operator_report, + credentialed_telemetry_retention_drill, missing_credentialed_adapter_env, + write_credentialed_operator_report, ) @@ -18,6 +23,43 @@ def test_credentialed_adapter_drill_reports_missing_env_without_secrets() -> Non assert report["diagnostics"][0]["code"] == "credential_env_missing" +def test_credentialed_operator_report_redacts_values_and_persists(tmp_path) -> None: + environ = { + "PHASE_MEMORY_MARKITECT_URL": "https://markitect.example.invalid", + "PHASE_MEMORY_MARKITECT_TOKEN": "markitect-secret-token", + "PHASE_MEMORY_KONTEXTUAL_URL": "https://kontextual.example.invalid", + "PHASE_MEMORY_KONTEXTUAL_TOKEN": "kontextual-secret-token", + } + + report = credentialed_operator_report(environ, run_id="pytest") + written = write_credentialed_operator_report(tmp_path / "operator-report.json", environ, run_id="pytest") + serialized = json.dumps(written, sort_keys=True) + + assert report["valid"] is True + assert written["id"] == report["id"] + assert written["redacted_env"]["secrets_redacted"] is True + assert "markitect-secret-token" not in serialized + assert "kontextual-secret-token" not in serialized + assert "https://markitect.example.invalid" not in serialized + assert "https://kontextual.example.invalid" not in 
serialized + assert (tmp_path / "operator-report.json").exists() + + +def test_credentialed_telemetry_retention_drill_prunes_fixture_events() -> None: + report = credentialed_telemetry_retention_drill( + {}, + operator_approved_fixture=True, + retention_days=30, + now=datetime(2026, 5, 19, tzinfo=timezone.utc), + ) + + assert report["valid"] is True + assert report["skipped"] is False + assert "op:old" in report["pruned_operation_ids"] + assert "op:new" in report["retained_operation_ids"] + assert "audit.retention.apply" in report["audit_operations"] + + @pytest.mark.skipif( missing_credentialed_adapter_env(os.environ), reason="requires env vars: " + ", ".join(CREDENTIALED_ADAPTER_ENV_VARS), diff --git a/tests/test_deployment.py b/tests/test_deployment.py new file mode 100644 index 0000000..22ed4aa --- /dev/null +++ b/tests/test_deployment.py @@ -0,0 +1,44 @@ +from phase_memory.deployment import ( + MANAGED_DEPLOYMENT_SCHEMA, + managed_deployment_manifest, + validate_managed_deployment_manifest, +) +from phase_memory.service_app import ServiceAppConfig + + +def test_managed_deployment_manifest_declares_entrypoint_probes_and_store() -> None: + manifest = managed_deployment_manifest( + ServiceAppConfig(host="0.0.0.0", port=8090, local_store_path="/var/lib/phase-memory"), + image="registry.example/phase-memory:test", + namespace="agents", + replicas=2, + ) + validation = validate_managed_deployment_manifest(manifest) + + assert manifest["schema_version"] == MANAGED_DEPLOYMENT_SCHEMA + assert manifest["service"]["command"][0] == "phase-memory-service" + assert manifest["service"]["ports"][0]["container_port"] == 8090 + assert manifest["probes"]["liveness"]["path"] == "/health" + assert manifest["probes"]["readiness"]["path"] == "/ready" + assert manifest["storage"]["volumes"][0]["mount_path"] == "/var/lib/phase-memory" + assert manifest["rollback"]["requires_store_snapshot"] is True + assert validation["valid"] is True + assert validation["diagnostics"] == [] + + +def 
test_managed_deployment_validation_reports_missing_contracts() -> None: + manifest = { + "schema_version": MANAGED_DEPLOYMENT_SCHEMA, + "service": {"command": ["python"], "replicas": 0}, + "probes": {"liveness": {"path": "/wrong"}}, + "storage": {"volumes": []}, + } + + validation = validate_managed_deployment_manifest(manifest) + codes = {diagnostic["code"] for diagnostic in validation["diagnostics"]} + + assert validation["valid"] is False + assert "managed_deployment_missing_service_entrypoint" in codes + assert "managed_deployment_probe_missing" in codes + assert "managed_deployment_store_mount_missing" in codes + assert "managed_deployment_replica_count_invalid" in codes diff --git a/tests/test_evaluation_scenarios.py b/tests/test_evaluation_scenarios.py index ba4ce71..ffdefff 100644 --- a/tests/test_evaluation_scenarios.py +++ b/tests/test_evaluation_scenarios.py @@ -4,7 +4,15 @@ from pathlib import Path from phase_memory.adapters import InMemorySemanticIndex from phase_memory.contracts import graph_from_markitect -from phase_memory.evaluation import EVALUATION_REPORT_SCHEMA, EVALUATION_TREND_SCHEMA, evaluation_threshold_report, evaluation_trend_artifact +from phase_memory.evaluation import ( + EVALUATION_REPORT_SCHEMA, + EVALUATION_TREND_HISTORY_SCHEMA, + EVALUATION_TREND_SCHEMA, + evaluation_threshold_report, + evaluation_trend_artifact, + load_evaluation_trend_history, + write_evaluation_trend_history, +) from phase_memory.models import ActivationPlan, MemoryPath from phase_memory.retrieval import activation_quality_report, select_event_path from phase_memory.runtime import PhaseMemoryRuntime @@ -126,6 +134,31 @@ def test_evaluation_trend_artifact_tracks_threshold_and_metric_deltas() -> None: assert trend["diagnostics"][0]["code"] == "evaluation_metric_regressed" +def test_evaluation_trend_history_persists_without_duplicate_runs(tmp_path) -> None: + data = json.loads((FIXTURES / "evaluation-scenarios.json").read_text(encoding="utf-8")) + report = 
evaluation_threshold_report(data) + first = evaluation_trend_artifact( + report, + run_metadata={"run_id": "first", "created_at": "2026-05-19T00:00:00+00:00"}, + ) + second = evaluation_trend_artifact( + report, + previous_report=report, + run_metadata={"run_id": "second", "created_at": "2026-05-20T00:00:00+00:00"}, + ) + path = tmp_path / "evaluation-trend-history.json" + + history = write_evaluation_trend_history(path, first) + history = write_evaluation_trend_history(path, first) + history = write_evaluation_trend_history(path, second) + loaded = load_evaluation_trend_history(path) + + assert history["schema_version"] == EVALUATION_TREND_HISTORY_SCHEMA + assert loaded["count"] == 2 + assert loaded["latest_artifact_id"] == second["id"] + assert "policy_denial_count" in loaded["metric_keys"] + + def _activation_plan(response): data = response["data"]["activation_plan"] return ActivationPlan( diff --git a/tests/test_troubleshooting.py b/tests/test_troubleshooting.py new file mode 100644 index 0000000..70c9b08 --- /dev/null +++ b/tests/test_troubleshooting.py @@ -0,0 +1,29 @@ +from phase_memory.troubleshooting import ( + TROUBLESHOOTING_REQUIRED_CATEGORIES, + operator_troubleshooting_matrix, + validate_operator_troubleshooting_matrix, +) + + +def test_operator_troubleshooting_matrix_covers_required_categories() -> None: + matrix = operator_troubleshooting_matrix() + validation = validate_operator_troubleshooting_matrix(matrix) + categories = {row["category"] for row in matrix["rows"]} + + assert set(TROUBLESHOOTING_REQUIRED_CATEGORIES) <= categories + assert validation["valid"] is True + assert validation["diagnostics"] == [] + + +def test_operator_troubleshooting_matrix_validation_reports_missing_fields() -> None: + matrix = { + "schema_version": "phase_memory.operator_troubleshooting.v1", + "rows": [{"category": "credentials", "diagnostic_code": "credential_env_missing"}], + } + + validation = validate_operator_troubleshooting_matrix(matrix) + codes = 
{diagnostic["code"] for diagnostic in validation["diagnostics"]} + + assert validation["valid"] is False + assert "troubleshooting_matrix_missing_category" in codes + assert "troubleshooting_matrix_missing_field" in codes diff --git a/workplans/PMEM-WP-0014-live-credential-execution-and-managed-deployment-hardening.md b/workplans/PMEM-WP-0014-live-credential-execution-and-managed-deployment-hardening.md index ca49120..f46c58b 100644 --- a/workplans/PMEM-WP-0014-live-credential-execution-and-managed-deployment-hardening.md +++ b/workplans/PMEM-WP-0014-live-credential-execution-and-managed-deployment-hardening.md @@ -4,7 +4,7 @@ type: workplan title: "Live Credential Execution And Managed Deployment Hardening" domain: markitect repo: phase-memory -status: ready +status: finished owner: codex topic_slug: phase-memory created: "2026-05-19" @@ -36,7 +36,7 @@ release-note discipline. The scorecard now rates the repo at **4.3 / 5**. ```task id: PMEM-WP-0014-T01 -status: todo +status: done priority: high state_hub_task_id: "1d0eb51c-60ce-47ad-bd91-6ce1ee91f0f8" ``` @@ -54,7 +54,7 @@ Acceptance: ```task id: PMEM-WP-0014-T02 -status: todo +status: done priority: high state_hub_task_id: "37b03680-fcc4-46c2-9ce2-f6bf1f2ef35b" ``` @@ -71,7 +71,7 @@ Acceptance: ```task id: PMEM-WP-0014-T03 -status: todo +status: done priority: medium state_hub_task_id: "a3260267-bc8f-4f17-abdd-2296ad2c6ed5" ``` @@ -88,7 +88,7 @@ Acceptance: ```task id: PMEM-WP-0014-T04 -status: todo +status: done priority: medium state_hub_task_id: "b68478ce-90c2-4e21-b621-569cb6925f74" ``` @@ -106,7 +106,7 @@ Acceptance: ```task id: PMEM-WP-0014-T05 -status: todo +status: done priority: medium state_hub_task_id: "b0974113-debd-4823-929a-761510132c09" ``` @@ -127,4 +127,21 @@ Acceptance: ## Closure Review -Pending implementation. 
+Implemented as a credential-safe operational hardening pass: + +- Credentialed drill configs now persist only endpoint/credential fingerprints, + and `credentialed_operator_report` / `write_credentialed_operator_report` + create redacted run artifacts. +- `credentialed_telemetry_retention_drill` exercises retention planning/apply + through the live-shaped telemetry sink or an operator-approved fixture. +- `managed_deployment_manifest` and + `validate_managed_deployment_manifest` define entrypoint, probe, rollback, + replica, and local-store mount expectations without requiring credentials. +- Evaluation trend artifacts can now be persisted into deterministic history + files without duplicate run ids. +- The operator runbook and troubleshooting matrix cover credential, + readiness, migration, retention, and adapter-manifest failures. + +No real endpoint credentials or managed platform were available in the default +workspace, so PMEM-WP-0015 should collect the first live credential and managed +deployment pilot evidence. diff --git a/workplans/PMEM-WP-0015-credentialed-live-pilot-and-deployment-evidence.md b/workplans/PMEM-WP-0015-credentialed-live-pilot-and-deployment-evidence.md new file mode 100644 index 0000000..62048b4 --- /dev/null +++ b/workplans/PMEM-WP-0015-credentialed-live-pilot-and-deployment-evidence.md @@ -0,0 +1,138 @@ +--- +id: PMEM-WP-0015 +type: workplan +title: "Credentialed Live Pilot And Deployment Evidence" +domain: markitect +repo: phase-memory +status: ready +owner: codex +topic_slug: phase-memory +created: "2026-05-19" +updated: "2026-05-19" +state_hub_workstream_id: "10e406f3-a016-46f6-92c4-9e0f8fc7ecc3" +--- + +# PMEM-WP-0015: Credentialed Live Pilot And Deployment Evidence + +## Goal + +Collect the first real operator evidence for live Markitect/Kontextual +credentials, managed deployment packaging, telemetry retention, and evaluation +history gates without committing credentials or endpoint secrets. 
+ +## Current Evidence + +`PMEM-WP-0014` added redacted operator reports, credential-safe telemetry +retention drills, managed deployment manifest validation, deterministic +evaluation trend history persistence, and an operator troubleshooting matrix. +The remaining maturity gap is live evidence from an approved operator +environment and deployment target. + +## Non-Goals + +- Commit tokens, live endpoint URLs, or platform secrets. +- Make live credential tests mandatory for default CI. +- Replace platform-specific deployment tooling owned by operators. + +## T01 - Run redacted credentialed live smoke report + +```task +id: PMEM-WP-0015-T01 +status: todo +priority: high +state_hub_task_id: "c095a240-0499-42a2-8661-7d4ead13d90e" +``` + +Run the credentialed operator report against approved live Markitect and +Kontextual endpoints. + +Acceptance: + +- Report artifact contains no tokens or raw endpoint URLs. +- Live adapter incompatibilities are captured as diagnostics. +- Operator confirms the report can be shared through normal repo progress + channels. + +## T02 - Pilot managed deployment package + +```task +id: PMEM-WP-0015-T02 +status: todo +priority: high +state_hub_task_id: "94fd6cf0-348b-47ac-87d9-17f1fa358590" +``` + +Translate the managed deployment manifest into the target operator platform and +run readiness checks. + +Acceptance: + +- `/health` and `/ready` probes pass in the pilot environment. +- Local-store mount and rollback procedure are validated. +- Platform-specific notes are added to the operator runbook without taking + ownership of that platform. + +## T03 - Capture external telemetry retention evidence + +```task +id: PMEM-WP-0015-T03 +status: todo +priority: medium +state_hub_task_id: "31f114bf-a7cb-4413-ab9b-51c7c00552c4" +``` + +Exercise telemetry export and retention apply against the approved credentialed +telemetry boundary. + +Acceptance: + +- Retention apply records an audit event. +- Pruned and retained operation ids are reviewable. 
+- Secret-bearing fields are absent from exported artifacts. + +## T04 - Promote evaluation trend history into a gate + +```task +id: PMEM-WP-0015-T04 +status: todo +priority: medium +state_hub_task_id: "74ba5e2f-e3f9-49a7-b2e5-c73ec478b1ab" +``` + +Persist trend history across commits or run ids and define the regression gate +operators should inspect. + +Acceptance: + +- Trend history is written as a durable artifact. +- Regression diagnostics identify metric declines. +- Runbook explains how to compare the latest artifact with prior runs. + +## T05 - Fold pilot feedback into troubleshooting + +```task +id: PMEM-WP-0015-T05 +status: todo +priority: medium +state_hub_task_id: "427d5cd6-f8e0-4c2f-bced-e4679461ebc1" +``` + +Use live pilot findings to refine the troubleshooting matrix and scorecard. + +Acceptance: + +- New operator failure modes have diagnostic codes and remediations. +- Scorecard distinguishes implemented tooling from verified live evidence. +- Next maturity target is adjusted based on actual pilot results. + +## Acceptance Criteria + +- PMEM-WP-0015 produces credential-safe artifacts from a real operator + environment. +- Managed deployment readiness has platform evidence, not just local manifest + validation. +- Scorecard can reasonably move toward the 4.7+ gate if the pilot succeeds. + +## Closure Review + +Pending implementation.
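
The duplicate-free history persistence that `write_evaluation_trend_history` adds in this patch, and that T04 of PMEM-WP-0015 builds on, can be sketched independently of the package. This is a minimal sketch under stated assumptions, not the package's implementation: `append_unique_artifact` and `demo` are hypothetical names, and the sketch leaves out the ordering, `metric_keys`, schema, and diagnostics handling that `evaluation_trend_history` performs. It only illustrates the rule visible in the diff: an artifact with an id is appended once, and an artifact without an id is always appended.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def append_unique_artifact(path: Path, artifact: dict[str, Any]) -> dict[str, Any]:
    """Append an artifact to a JSON history file unless its id is already present.

    Hypothetical stand-in for the patch's write_evaluation_trend_history;
    it keeps insertion order and omits schema, metric, and diagnostic fields.
    """
    if path.exists():
        history = json.loads(path.read_text(encoding="utf-8"))
    else:
        history = {"artifacts": []}
    artifacts = list(history.get("artifacts") or [])
    artifact_id = str(artifact.get("id") or "")
    known_ids = {str(item.get("id") or "") for item in artifacts}
    # Artifacts without an id are always appended; identified ones only once.
    if not artifact_id or artifact_id not in known_ids:
        artifacts.append(dict(artifact))
    history = {
        "count": len(artifacts),
        "latest_artifact_id": artifacts[-1].get("id", "") if artifacts else "",
        "artifacts": artifacts,
    }
    # Create the parent directory so a first write to a fresh location succeeds.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, indent=2, sort_keys=True) + "\n", encoding="utf-8")
    return history


def demo() -> dict[str, Any]:
    """Write the same run twice, then a second run, and return the final history."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "history" / "evaluation-trend-history.json"
        append_unique_artifact(path, {"id": "run:first"})
        append_unique_artifact(path, {"id": "run:first"})  # ignored as a duplicate
        return append_unique_artifact(path, {"id": "run:second"})
```

Calling `demo()` mirrors the shape of `test_evaluation_trend_history_persists_without_duplicate_runs`: the repeated write of the first run leaves the history unchanged, so only two artifacts remain after the third call.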