Implement PMEM-WP-0015 credentialed live pilot with ops-warden routing.

Add credential routing advisories via warden route/access, live pilot evidence
helpers, managed deployment pilot probes, evaluation trend regression gates,
and expanded troubleshooting. Update operator runbook and maturity scorecard.
2026-07-02 23:24:35 +02:00
parent bff90ec1ed
commit 29f893b905
15 changed files with 913 additions and 38 deletions


@@ -1,6 +1,6 @@
# Operator Readiness Runbook
Updated: 2026-05-19
Updated: 2026-07-02
This runbook covers the operational path for `phase-memory` without requiring
credentials in the default test suite.
@@ -20,7 +20,16 @@ Credentialed drills require:
- `PHASE_MEMORY_KONTEXTUAL_URL`
- `PHASE_MEMORY_KONTEXTUAL_TOKEN`
Do not store those values in Git, workplans, progress logs, or release notes.
Obtain credentials through ops-warden routing; ops-warden itself does not vend
secret values:
```bash
warden route find "phase-memory markitect kontextual api token" --json
warden access "phase-memory markitect kontextual api token" --json
```
Export the returned values into the drill shell only. Do not store those values
in Git, workplans, progress logs, or release notes.
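Before starting a drill, it can help to confirm that the drill shell actually has the credentials exported, without echoing them. The following is a minimal sketch, not part of `phase-memory`: `missing_credential_vars` is a hypothetical helper, and it assumes the two variable names shown above are the complete required set.

```python
import os
from typing import Mapping

# Variable names taken from the runbook; the full required list may be longer.
REQUIRED_VARS = (
    "PHASE_MEMORY_KONTEXTUAL_URL",
    "PHASE_MEMORY_KONTEXTUAL_TOKEN",
)


def missing_credential_vars(environ: Mapping[str, str]) -> list[str]:
    """Return the names of required variables that are unset or blank.

    Hypothetical helper for illustration; it inspects names only and
    never returns or logs a credential value.
    """
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credential_vars(os.environ)
    # Report variable names only, so nothing secret reaches the terminal log.
    print("missing:", ", ".join(missing) if missing else "none")
```

Reporting names rather than values keeps the check safe to paste into a progress log.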
## Service Startup
@@ -117,6 +126,15 @@ Use export batches for operator review, not as a credential or secret store.
## Credentialed Drill
Resolve credential routing before running live drills:
```python
from phase_memory import resolve_credentialed_environ, warden_credential_routing_advisory
advisory = warden_credential_routing_advisory()
status = resolve_credentialed_environ()
```
Run the credentialed smoke test only from an operator environment:
```bash
@@ -150,6 +168,26 @@ report = credentialed_telemetry_retention_drill(operator_approved_fixture=True)
The drill records old and new audit events, plans retention, applies pruning,
and reports retained/pruned operation ids without storing credential values.
## Live Pilot Evidence
Collect credential-safe pilot artifacts for operator review:
```python
import os  # needed for os.environ below
from phase_memory import write_live_pilot_evidence
write_live_pilot_evidence("reports/live-pilot", environ=os.environ)
```
Artifacts include:
- `live-pilot-report.json` — aggregate pilot status and live_evidence flags
- `credentialed-operator-report.json` — redacted smoke report
- `managed-deployment-pilot.json` — manifest validation and probe results
- `telemetry-retention-evidence.json` — retention apply audit trace
- `evaluation-trend-history.json` — persisted trend artifacts
- `evaluation-regression-gate.json` — operator regression gate
- `credential-routing-advisory.json` — ops-warden routing without secrets
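An operator reviewing the artifact list above may want a quick completeness and leak check before sharing the directory. The sketch below is illustrative only: `missing_pilot_artifacts` and `artifacts_containing` are hypothetical helpers, and it assumes the artifacts are written as flat files directly inside the target directory.

```python
from pathlib import Path

# File names copied from the artifact list in this runbook.
EXPECTED_ARTIFACTS = (
    "live-pilot-report.json",
    "credentialed-operator-report.json",
    "managed-deployment-pilot.json",
    "telemetry-retention-evidence.json",
    "evaluation-trend-history.json",
    "evaluation-regression-gate.json",
    "credential-routing-advisory.json",
)


def missing_pilot_artifacts(directory: str) -> list[str]:
    """Return expected artifact names that are not present as files."""
    root = Path(directory)
    return [name for name in EXPECTED_ARTIFACTS if not (root / name).is_file()]


def artifacts_containing(directory: str, secret: str) -> list[str]:
    """Return artifact names whose text contains the given secret string.

    An empty secret matches nothing, so an unset token does not flag every file.
    """
    if not secret:
        return []
    root = Path(directory)
    hits = []
    for name in EXPECTED_ARTIFACTS:
        path = root / name
        if path.is_file() and secret in path.read_text(encoding="utf-8", errors="replace"):
            hits.append(name)
    return hits
```

For example, `artifacts_containing("reports/live-pilot", os.environ.get("PHASE_MEMORY_KONTEXTUAL_TOKEN", ""))` should return an empty list if redaction worked as intended.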
## Managed Deployment Manifest
Build and validate a deployment manifest before handing it to platform-specific
@@ -186,6 +224,19 @@ history = write_evaluation_trend_history("reports/evaluation-trend-history.json"
Repeated writes of the same trend id do not duplicate the run.
Gate promotion on evaluation regressions:
```python
from phase_memory import evaluation_trend_regression_gate, load_evaluation_trend_history
history = load_evaluation_trend_history("reports/evaluation-trend-history.json")
gate = evaluation_trend_regression_gate(history)
```
Compare the latest artifact metrics in `evaluation-trend-history.json` against
those recorded for the previous run id. Block promotion when either
`metric_regressions` or `threshold_failures` is non-empty.
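The blocking rule can be sketched as a small predicate. This assumes the gate result is (or can be loaded as) a mapping with `metric_regressions` and `threshold_failures` keys, for example from `evaluation-regression-gate.json`; the actual return type of `evaluation_trend_regression_gate` is not shown here, and `promotion_blocked` is a hypothetical name.

```python
from typing import Any, Mapping


def promotion_blocked(gate: Mapping[str, Any]) -> bool:
    """Return True when the gate lists any regression or threshold failure.

    Assumes a mapping-shaped gate; missing keys are treated as empty.
    """
    return bool(gate.get("metric_regressions")) or bool(gate.get("threshold_failures"))
```

Treating a missing key as empty is a design choice for this sketch; a stricter gate could instead refuse promotion when either key is absent.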
## Troubleshooting Matrix
| Category | Diagnostic | Operator action |
@@ -195,6 +246,10 @@ Repeated writes of the same trend id do not duplicate the run.
| Migrations | `store_migration_unsupported` | Use a file-backed local store or run repair diagnostics before accepting traffic. |
| Audit retention | `audit_retention_apply_unsupported` | Switch to a JSONL or telemetry audit sink with retention support, then rerun the retention drill. |
| Adapter manifest | `adapter_pack_manifest_invalid` | Regenerate and validate the adapter pack manifest before using the pack. |
| Credential routing | `warden_cli_unavailable` | Install warden from ops-warden, then run `warden route find` before exporting `PHASE_MEMORY_*` variables. |
| Deployment | `managed_deployment_probe_failed` | Run `phase-memory-service --check` and validate managed deployment manifest probes before promotion. |
| Evaluation | `evaluation_metric_regressed` | Compare latest and previous trend artifacts; inspect scenario diagnostics before release. |
| Pilot | `pilot_credentialed_env_missing` | Obtain credentials through ops-warden routing and rerun `write_live_pilot_evidence`. |
## Compatibility Release Discipline