# Operator Readiness Runbook
Updated: 2026-05-19

This runbook covers the operational path for `phase-memory` without requiring
credentials in the default test suite.
## Modes
| Mode | Purpose | Credentials | Network |
| --- | --- | --- | --- |
| Local fixture | Default deterministic runtime and tests. | No | No |
| Live-shaped | Adapter manifests and behavior that model live services locally. | No | No |
| Credentialed live drill | Operator-provided smoke drill for real endpoints. | Yes, via env only | Optional |

Credentialed drills require:
- `PHASE_MEMORY_MARKITECT_URL`
- `PHASE_MEMORY_MARKITECT_TOKEN`
- `PHASE_MEMORY_KONTEXTUAL_URL`
- `PHASE_MEMORY_KONTEXTUAL_TOKEN`

Do not store those values in Git, workplans, progress logs, or release notes.
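
A pre-flight check can confirm that all four variables are present before a drill starts, without ever echoing their values. The sketch below is illustrative only: `missing_drill_vars` is a hypothetical helper, not part of `phase-memory`; only the variable names come from this runbook.

```python
import os

# Variable names are taken from the list above; the helper is hypothetical.
REQUIRED_VARS = (
    "PHASE_MEMORY_MARKITECT_URL",
    "PHASE_MEMORY_MARKITECT_TOKEN",
    "PHASE_MEMORY_KONTEXTUAL_URL",
    "PHASE_MEMORY_KONTEXTUAL_TOKEN",
)


def missing_drill_vars(environ=None):
    """Return the names of required variables that are unset or empty.

    Only names are returned, so the result is safe to log.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_drill_vars()` with no argument reads the current process environment; an empty list means the drill has what it needs.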
## Service Startup
The deployable stdlib entrypoint is `phase-memory-service`.

Readiness check without listening:
```bash
phase-memory-service --check --store .phase-memory-local
```
Start the stdlib WSGI service:
```bash
phase-memory-service --host 127.0.0.1 --port 8080 --store .phase-memory-local
```
Routes:
- `GET /health`
- `GET /ready`
- `GET /contracts`
- `POST /operations/{operation}`
- `POST /operations` with `{"operation": "...", "payload": {...}}`
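
The two `POST` forms can be exercised from the standard library. The following is a minimal sketch under stated assumptions: the host and port match the startup example above, the body is JSON, and the path form is assumed to take the payload directly as its body. `operation_request` is a hypothetical helper, not part of the service.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # host and port from the startup example


def operation_request(operation, payload, in_path=True):
    """Build a POST request for either documented route form.

    in_path=True  -> POST /operations/{operation}, body is the payload
                     (assumed body shape for this form)
    in_path=False -> POST /operations, body wraps operation and payload
    """
    if in_path:
        url = f"{BASE_URL}/operations/{operation}"
        body = payload
    else:
        url = f"{BASE_URL}/operations"
        body = {"operation": operation, "payload": payload}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Against a running service, the request would be sent with `urllib.request.urlopen(operation_request(...))`; the operation name and payload fields depend on what `GET /contracts` reports.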
## Readiness Checks
Before accepting traffic:
1. Run `phase-memory-service --check`.
2. Verify `/ready` reports `ok: true`.
3. Verify `unsupported_operations` is empty.
4. Verify adapter diagnostics have no `error` severity.
5. Verify the public API snapshot test passes after any operation/export change.
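
Steps 2 through 4 can be folded into one gate over the `/ready` response. This is a sketch, not the service's own logic: it assumes the response is a JSON object with top-level `ok` and `unsupported_operations` fields and a `diagnostics` list whose entries carry a `severity` key. The `diagnostics` name and its layout are assumptions; adjust them to the actual payload.

```python
def readiness_problems(ready):
    """Return a list of reasons the service should not accept traffic.

    An empty list means steps 2-4 of the checklist pass.
    """
    problems = []
    if ready.get("ok") is not True:
        problems.append("ok is not true")
    unsupported = ready.get("unsupported_operations") or []
    if unsupported:
        problems.append("unsupported operations: " + ", ".join(unsupported))
    # Assumed layout: a list of dicts, each with a "severity" field.
    errors = [d for d in ready.get("diagnostics", []) if d.get("severity") == "error"]
    if errors:
        problems.append(f"{len(errors)} adapter diagnostic(s) with error severity")
    return problems
```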
## Migration Apply
Plan and apply local-store metadata migrations through the runtime:
```python
from phase_memory import RuntimeConfig, runtime_from_config

config = RuntimeConfig(local_store_path=".phase-memory-local")
runtime = runtime_from_config(config)

# Plan first so diagnostics can be reviewed before anything is written.
plan = runtime.plan_store_migration(source_ref=config.local_store_path)

# Apply the planned migration as the named actor.
result = runtime.apply_store_migration(
    plan["data"]["migration_plan"],
    actor="operator",
    source_ref=config.local_store_path,
)
```
Expected:
- no `error` diagnostics in the plan;
- `result["valid"] is True`;
- metadata is updated atomically;
- `audit.query` can find the `store.migration.apply` event.

Rollback:
- stop the service;
- restore the previous local store directory from backup;
- rerun `phase-memory-service --check`;
- rerun `runtime.repair_diagnostics()`.
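
The rollback path depends on a backup taken before the migration is applied. The runbook does not prescribe a backup mechanism, so the following is only one possible sketch using the standard library; `backup_store` and `restore_store` are hypothetical helpers, and the service is assumed to be stopped while either one runs.

```python
import shutil
import time
from pathlib import Path


def backup_store(store_path, backup_root):
    """Copy the store directory to a timestamped directory under backup_root."""
    src = Path(store_path)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    shutil.copytree(src, dest)  # creates missing parent directories
    return dest


def restore_store(backup_path, store_path):
    """Replace the store directory with the contents of a backup."""
    dest = Path(store_path)
    if dest.exists():
        shutil.rmtree(dest)  # drop the migrated store before restoring
    shutil.copytree(backup_path, dest)
    return dest
```

A typical sequence would be `backup_store(".phase-memory-local", "backups")` before the apply step, then `restore_store(...)` with the returned path if the migration has to be rolled back.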
## Audit Export And Retention
Plan retention:
```python
plan = runtime.audit_retention_plan(retention_days=30)
```
Apply retention:
```python
result = runtime.apply_audit_retention(plan["plan"])
```
Expected:
- eligible operation ids are pruned;
- `audit.retention.apply` is recorded after pruning;
- no retention apply happens when the sink reports unsupported behavior.

Export a trace batch:
```python
export = runtime.export_audit_events({"operation": "package.compile"})
```
Use export batches for operator review, not as a credential or secret store.
## Credentialed Drill
Run the credentialed smoke test only from an operator environment:
```bash
PHASE_MEMORY_MARKITECT_URL=... \
PHASE_MEMORY_MARKITECT_TOKEN=... \
PHASE_MEMORY_KONTEXTUAL_URL=... \
PHASE_MEMORY_KONTEXTUAL_TOKEN=... \
python3 -m pytest tests/test_credentialed_drills.py
```
The report redacts tokens and uses a credential fingerprint rather than
persisting secrets.
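
To illustrate what a credential fingerprint can look like, the sketch below derives a short identifier from a token with SHA-256 and substitutes it wherever the token appears in report text. The hash choice, the 12-character length, and both function names are assumptions made for illustration; the drill's actual fingerprint scheme may differ.

```python
import hashlib


def credential_fingerprint(token, length=12):
    """Return a short, non-reversible identifier for a token."""
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return digest[:length]


def redact(text, token):
    """Replace every occurrence of the token with its fingerprint marker."""
    if not token:
        return text  # nothing to redact; avoids replacing the empty string
    return text.replace(token, f"<redacted:{credential_fingerprint(token)}>")
```

The fingerprint lets two reports be compared to see whether they used the same credential without either report containing the credential itself.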
## Compatibility Release Discipline
When public exports or service operations change:
1. Update `tests/fixtures/public-api-snapshot.json`.
2. Fill in `docs/release-note-template.md`.
3. Call out changed exports, changed service operations, migration needs, and
   operator action.
4. Link the workplan or decision that authorized the change.
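
When filling in the release note, a diff of the old and new snapshot makes the changed exports explicit. The layout of `tests/fixtures/public-api-snapshot.json` is not described here, so this sketch assumes the simplest case, a flat collection of export or operation names; `snapshot_changes` is a hypothetical helper.

```python
def snapshot_changes(old_names, new_names):
    """Return names added to and removed from a snapshot.

    Both arguments are assumed to be iterables of name strings.
    """
    old_set, new_set = set(old_names), set(new_names)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
    }
```

Any entry under `removed` is a candidate for the "operator action" part of the release note, since existing callers may depend on it.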