Phase Memory Maturity Scorecard
Updated: 2026-05-19
Purpose
This scorecard tracks progress toward INTENT.md: a profile-driven,
phase-aware memory infrastructure layer for agentic systems.
The original scorecard treated roadmap closure and fake external adapters as near-operational maturity. The refined scoring below is stricter: fake adapters prove wiring and contracts, but live durability, migration, telemetry, service bindings, and broader evaluation corpora are still needed before scoring close to 5.
Scoring Model
| Score | Meaning |
|---|---|
| 0 | Not started. |
| 1 | Intent or docs only. |
| 2 | Deterministic local library behavior with tests. |
| 3 | Usable runtime or CLI behavior with stable envelopes. |
| 4 | Integration-ready local service boundary with policy, persistence, interop, and conformance coverage. |
| 5 | Operationally mature with live adapter implementations, migrations, telemetry, retention, service bindings, and evaluation gates. |
Current Score
Overall maturity: 4.2 / 5
Two sub-scores make the result easier to reason about:
- Local integration maturity: 4.5 / 5
- Operational maturity: 3.8 / 5
The repo is strong as a deterministic local library and service-boundary core. It is not yet production-operational for two reasons: adapter coverage still relies on live-shaped fixtures rather than credentialed live integrations, and the service bindings are framework-neutral embedding surfaces rather than a deployed service.
Dimension Scorecard
| Dimension | Score | Target | Evidence | Needed Next |
|---|---|---|---|---|
| Intent and boundaries | 4.4 | 5.0 | INTENT.md, SCOPE.md, README.md, architecture docs, adjacent-repo boundary docs | Keep docs current as live adapters and service bindings clarify real ownership. |
| Package and API foundation | 4.5 | 4.8 | Python package, public exports, runtime facade, CLI, service runner export, service config, dependency-light tests, public API snapshot | Add release notes discipline and compatibility migration examples. |
| Markitect profile contract ingress | 3.7 | 4.5 | Profile loading, diagnostics, runtime envelopes, profile-derived config, local alias normalization | Add richer compatibility fixtures and schema drift diagnostics. |
| Graph and event ingress | 4.0 | 4.5 | Graph loading, endpoint diagnostics, event model, JSONL log, export, repair checks, corrupt-record diagnostics, fake and live-shaped graph/event adapters | Add broader malformed/large graph fixtures and operator repair utilities. |
| Phase domain model | 3.5 | 4.5 | Phases, lifecycle states, actions, paths, retention rules, profile-derived transition rules | Add migration semantics for profile/rule changes over durable stores. |
| Profile execution planning | 4.3 | 4.5 | Adapter plan, capabilities, policy gates, fallback behavior, config-driven local/external resolution, adapter pack manifests, live-shaped compatibility gates | Add compatibility gates for credentialed live adapter packs. |
| Lifecycle planning and apply | 4.1 | 4.5 | Dry-run lifecycle plans, profile rules, review-gated local apply, service lifecycle.apply, apply audit/export queries | Add richer apply rollback and repair drills. |
| Activation planning | 4.0 | 4.8 | Budgeted activation, selections, package request, graph neighborhoods, paths, ranking, metrics, multi-scenario evaluation fixtures | Wire semantic-index-assisted retrieval into runtime planning. |
| Local persistence | 4.0 | 4.5 | File-backed graph store, JSONL event log, audit sink, atomic JSON writes, executable metadata migrations, migration audit, export, repair diagnostics | Add compaction/retention utilities and stronger corruption recovery. |
| Policy, review, and audit | 4.2 | 5.0 | Operation points, review records, audit schema, queryable/exportable audit sinks, retention plans, denials, redaction, fake/live-shaped policy/audit adapters | Add live policy adapter boundary and enforceable audit retention pruning. |
| Observability and operations | 4.0 | 4.8 | Health report, readiness report, config diagnostics, adapter status, service binding, fake/live-shaped telemetry audit sinks, operational recipe | Add metrics/event export to external telemetry and deployable service packaging. |
| Markitect interop | 4.0 | 4.5 | Local validation, package request/response envelopes, fake and live-shaped compiler fixtures | Add optional credentialed Markitect compiler adapter and schema drift suite. |
| Kontextual/Infospace interop | 3.7 | 4.5 | Delegation envelope, fake and live-shaped runtime registry, activation quality report fixture, adapter compatibility manifests | Add credentialed Kontextual adapter drill and broader Infospace restart reports. |
| Testing and evaluation | 4.3 | 4.7 | Deterministic tests over runtime, CLI, adapters, policy, activation, lifecycle, service, fakes, live-shaped packs, API snapshots, and evaluation threshold reports | Add larger regression corpus and threshold trend reports. |
| Service readiness | 4.5 | 4.8 | Service contracts, full local runner parity, framework-neutral service binding, WSGI adapter, health/readiness, config, adapter conformance | Add deployable packaging and operator readiness runbooks. |
| Developer experience | 4.3 | 4.7 | README, package map, CLI examples, persistence/policy/interop/service/lifecycle/fake-pack docs, operational recipe, API compatibility docs | Add troubleshooting matrix and release note templates. |
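One way to read the table is to rank dimensions by how far each score sits below its target. The sketch below does that with the scores and targets copied from the rows above; the `rank_by_gap` helper is a hypothetical illustration and not part of the package.

```python
# Scores and targets copied from the Dimension Scorecard table: name -> (score, target).
DIMENSIONS = {
    "Intent and boundaries": (4.4, 5.0),
    "Package and API foundation": (4.5, 4.8),
    "Markitect profile contract ingress": (3.7, 4.5),
    "Graph and event ingress": (4.0, 4.5),
    "Phase domain model": (3.5, 4.5),
    "Profile execution planning": (4.3, 4.5),
    "Lifecycle planning and apply": (4.1, 4.5),
    "Activation planning": (4.0, 4.8),
    "Local persistence": (4.0, 4.5),
    "Policy, review, and audit": (4.2, 5.0),
    "Observability and operations": (4.0, 4.8),
    "Markitect interop": (4.0, 4.5),
    "Kontextual/Infospace interop": (3.7, 4.5),
    "Testing and evaluation": (4.3, 4.7),
    "Service readiness": (4.5, 4.8),
    "Developer experience": (4.3, 4.7),
}


def rank_by_gap(dimensions):
    """Return (name, gap) pairs, largest target-minus-score gap first."""
    gaps = [
        (name, round(target - score, 1))
        for name, (score, target) in dimensions.items()
    ]
    # sorted() is stable, so ties keep the table's row order.
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, gap in rank_by_gap(DIMENSIONS)[:5]:
        print(f"{gap:.1f}  {name}")
```

By this reading, the phase domain model has the largest gap (1.0), followed by a group of dimensions that each sit 0.8 below target.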
Assessment
The project has crossed the local integration-readiness threshold. The runtime envelopes, policy/review model, profile-derived configuration, lifecycle rules, local persistence migrations, queryable/exportable audit path, fake and live-shaped external pack manifests, service binding, API snapshots, and conformance helpers form a solid integration boundary.
The largest remaining opportunity is the operational layer: moving from live-shaped local fixtures to credentialed live adapter drills, packaging the service binding for deployment, and growing evaluation thresholds into trend reports.
Completed Refinement Workplan
PMEM-WP-0011 moved the score from 3.8 to 4.0 by adding:
- full local service runner parity for `SERVICE_OPERATIONS`;
- service-covered `package.compile`, `lifecycle.apply`, and `audit.query`;
- queryable audit sinks with retention metadata;
- local-store atomic JSON writes, migration diagnostics, and corrupt-record repair diagnostics;
- three evaluation scenario families covering policy denial, lifecycle rules, event-path activation, semantic-index hints, and budget pressure;
- adapter pack manifests and explicit missing-capability diagnostics;
- an operational end-to-end recipe.
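For readers unfamiliar with the atomic JSON write pattern mentioned in the list above, here is a minimal sketch of the common write-to-temp-then-rename approach. It is an assumption about the general technique, not a copy of the repo's implementation, and `atomic_write_json` is a hypothetical name.

```python
import json
import os
import tempfile


def atomic_write_json(path, payload):
    """Write payload as JSON so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomically swaps in the new file
    except BaseException:
        # Leave the previous file untouched and clean up the temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If serialization fails partway through, the existing file at `path` is left as it was, which is the property that makes corrupt-record diagnostics the exception rather than the norm.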
PMEM-WP-0012 moved the score from 4.0 to 4.2 by adding:
- framework-neutral `ServiceBinding` and WSGI adapter tests without starting a listener;
- executable local-store migration planning/apply behavior with audit traces;
- live-shaped Markitect/Kontextual/telemetry adapter fixtures behind the same manifest and conformance contract;
- audit retention plans and export batches;
- evaluation threshold reports over the scenario corpus;
- public API and service operation compatibility snapshots.
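Testing a WSGI adapter without starting a listener usually means calling the application callable directly with a synthetic environ. The sketch below shows that technique with the standard library only; `health_app` is a hypothetical stand-in for the real adapter, and `call_wsgi` is an illustrative helper.

```python
import json
from wsgiref.util import setup_testing_defaults


def health_app(environ, start_response):
    """Hypothetical WSGI app standing in for the real service adapter."""
    body = json.dumps({"status": "ok", "path": environ["PATH_INFO"]}).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_wsgi(app, path, method="GET"):
    """Invoke a WSGI app in-process; no socket or server is involved."""
    environ = {"SCRIPT_NAME": "", "PATH_INFO": path, "REQUEST_METHOD": method}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    status, headers, body = call_wsgi(health_app, "/health")
    print(status, headers["Content-Type"], body.decode("utf-8"))
```

Because the call is an ordinary function invocation, the test stays deterministic and needs no free port.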
Recommended Next Refinement
Create and execute PMEM-WP-0013: credentialed adapter drills and deployment
packaging.
Highest-value tasks:
- Add optional credentialed Markitect/Kontextual adapter smoke drills that are skipped unless credentials are present.
- Package the service binding as a deployable local service with operator readiness checks.
- Add audit retention pruning and telemetry export enforcement.
- Grow evaluation reporting into historical threshold trends.
- Add release note and migration-note templates for compatibility changes.
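The first task above, smoke drills that are skipped unless credentials are present, can be sketched with the standard `unittest` skip decorators. The environment variable name `PMEM_MARKITECT_TOKEN` and the test body are assumptions for illustration; the real drill would call the credentialed adapter.

```python
import os
import unittest

# Hypothetical variable name; the actual credential source is not defined here.
CREDENTIAL_ENV = "PMEM_MARKITECT_TOKEN"


def have_credentials(environ=None):
    """Return True when a non-empty credential is available."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(CREDENTIAL_ENV))


class MarkitectSmokeDrill(unittest.TestCase):
    @unittest.skipUnless(have_credentials(), f"{CREDENTIAL_ENV} not set")
    def test_live_round_trip(self):
        # Placeholder assertion: a real drill would exercise the live adapter
        # through the same conformance suite as the fake/live-shaped packs.
        self.assertTrue(have_credentials())


def run_drills():
    """Run the drill suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MarkitectSmokeDrill)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    outcome = run_drills()
    print("skipped:", len(outcome.skipped), "failures:", len(outcome.failures))
```

A skipped drill is reported as skipped rather than passed, so an uncredentialed CI run cannot be mistaken for live coverage.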
Score Movement Gates
Achieved overall score 4.0 when:
- Service runner handles every operation in `SERVICE_OPERATIONS`.
- Audit query and lifecycle apply are covered through service contracts.
- Local persistence has migration diagnostics.
- Evaluation fixtures cover at least three profile/graph families.
Move overall score to 4.3+ when:
- Credentialed optional Markitect or Kontextual adapter smoke drills run behind the same conformance suite as the fake/live-shaped packs.
- Operational docs include deployable service packaging and an operator readiness runbook.
Move overall score to 4.7+ only when:
- Live adapter behavior, telemetry, audit retention, migration, and evaluation gates are all exercised by repeatable tests or documented operator drills.
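The gates above are cumulative: a higher threshold only counts once every lower one is met. A minimal sketch of that rule follows; the gate identifiers are hypothetical shorthand for the bullets above, and the function reports which gate threshold has been cleared, not the scorecard's actual score.

```python
# Thresholds from the Score Movement Gates section; gate names are illustrative labels.
GATES = [
    (4.0, [
        "service_runner_parity",
        "service_covered_audit_and_apply",
        "migration_diagnostics",
        "three_evaluation_families",
    ]),
    (4.3, [
        "credentialed_smoke_drills",
        "deployable_packaging_and_runbook",
    ]),
    (4.7, [
        "operational_behavior_exercised",
    ]),
]


def highest_cleared(satisfied):
    """Return the highest threshold whose gates, and all lower gates, are satisfied."""
    satisfied = set(satisfied)
    cleared = None
    for threshold, required in GATES:
        if not set(required) <= satisfied:
            break  # a missing lower gate blocks every higher threshold
        cleared = threshold
    return cleared


if __name__ == "__main__":
    print(highest_cleared(GATES[0][1]))
```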