# Phase Memory Maturity Scorecard

Updated: 2026-05-18

## Purpose

This scorecard tracks progress toward `INTENT.md`: a profile-driven,
phase-aware memory infrastructure layer for agentic systems.

The original scorecard treated roadmap closure and fake external adapters as
near-operational maturity. The refined scoring below is stricter: fake adapters
prove wiring and contracts, but live durability, migration, telemetry, service
bindings, and broader evaluation corpora are still needed before scoring close
to 5.

## Scoring Model

| Score | Meaning |
| --- | --- |
| 0 | Not started. |
| 1 | Intent or docs only. |
| 2 | Deterministic local library behavior with tests. |
| 3 | Usable runtime or CLI behavior with stable envelopes. |
| 4 | Integration-ready local service boundary with policy, persistence, interop, and conformance coverage. |
| 5 | Operationally mature with live adapter implementations, migrations, telemetry, retention, service bindings, and evaluation gates. |

## Current Score

Overall maturity: **4.0 / 5**

Two sub-scores make the result easier to reason about:

- Local integration maturity: **4.3 / 5**
- Operational maturity: **3.5 / 5**

The repo is strong as a deterministic local library and service-boundary core.
It is not yet production-operational because the external adapters are fakes,
service bindings are framework-neutral shapes rather than deployable endpoints,
and migration behavior is diagnostic rather than an operator-applied migration
system.

## Dimension Scorecard

| Dimension | Score | Target | Evidence | Needed Next |
| --- | ---: | ---: | --- | --- |
| Intent and boundaries | 4.4 | 5.0 | `INTENT.md`, `SCOPE.md`, `README.md`, architecture docs, adjacent-repo boundary docs | Keep docs current as live adapters and service bindings clarify real ownership. |
| Package and API foundation | 4.3 | 4.5 | Python package, public exports, runtime facade, CLI, service runner export, service config, dependency-light tests | Add public export compatibility checks and release notes discipline. |
| Markitect profile contract ingress | 3.7 | 4.5 | Profile loading, diagnostics, runtime envelopes, profile-derived config, local alias normalization | Add richer compatibility fixtures and schema drift diagnostics. |
| Graph and event ingress | 3.9 | 4.5 | Graph loading, endpoint diagnostics, event model, JSONL log, export, repair checks, corrupt-record diagnostics, fake graph/event adapters | Add broader malformed/large graph fixtures and operator repair utilities. |
| Phase domain model | 3.5 | 4.5 | Phases, lifecycle states, actions, paths, retention rules, profile-derived transition rules | Add migration semantics for profile/rule changes over durable stores. |
| Profile execution planning | 4.2 | 4.5 | Adapter plan, capabilities, policy gates, fallback behavior, config-driven local/external resolution, adapter pack manifests | Add compatibility gates for live adapter packs. |
| Lifecycle planning and apply | 4.0 | 4.5 | Dry-run lifecycle plans, profile rules, review-gated local apply, service `lifecycle.apply`, apply audit queries | Add operator migration semantics and richer apply rollback/repair drills. |
| Activation planning | 4.0 | 4.8 | Budgeted activation, selections, package request, graph neighborhoods, paths, ranking, metrics, multi-scenario evaluation fixtures | Wire semantic-index-assisted retrieval into runtime planning. |
| Local persistence | 3.7 | 4.5 | File-backed graph store, JSONL event log, audit sink, atomic JSON writes, metadata migration diagnostics, export, repair diagnostics | Add executable migrations, compaction/retention utilities, and stronger corruption recovery. |
| Policy, review, and audit | 3.9 | 5.0 | Operation points, review records, audit schema, queryable audit sinks, denials, redaction, fake external policy/audit adapters | Add live policy adapter boundary and enforceable audit retention policy. |
| Observability and operations | 3.6 | 4.8 | Health report, config diagnostics, adapter status, fake telemetry audit sink, operational recipe | Add metrics/event export and deployable health/readiness binding. |
| Markitect interop | 3.7 | 4.5 | Local validation, package request/response envelopes, fake compiler | Add optional live Markitect compiler adapter and contract compatibility suite. |
| Kontextual/Infospace interop | 3.3 | 4.5 | Delegation envelope, fake runtime registry, activation quality report fixture, adapter compatibility manifests | Add live/fake delegation scenarios and broader Infospace restart reports. |
| Testing and evaluation | 4.1 | 4.5 | 70 deterministic tests over runtime, CLI, adapters, policy, activation, lifecycle, service, fakes, and evaluation scenarios | Add larger regression corpus and threshold trend reports. |
| Service readiness | 4.2 | 4.8 | Service contracts, full local runner parity, health, config, adapter conformance, fake pack | Add optional framework binding and deployable readiness endpoints. |
| Developer experience | 4.1 | 4.5 | README, package map, CLI examples, persistence/policy/interop/service/lifecycle/fake-pack docs, operational recipe | Add troubleshooting matrix and embedded-service examples. |

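The table can be summarized with a few lines of Python. This is only a reading aid: the scorecard does not say how the overall score is derived, and the unweighted mean of the dimension scores (about 3.91) sits slightly below the reported 4.0, so the overall figure presumably reflects judgment or weighting not shown here. The function names are illustrative and not part of the package.

```python
# Scores and targets copied from the Dimension Scorecard table: name -> (score, target).
DIMENSIONS = {
    "Intent and boundaries": (4.4, 5.0),
    "Package and API foundation": (4.3, 4.5),
    "Markitect profile contract ingress": (3.7, 4.5),
    "Graph and event ingress": (3.9, 4.5),
    "Phase domain model": (3.5, 4.5),
    "Profile execution planning": (4.2, 4.5),
    "Lifecycle planning and apply": (4.0, 4.5),
    "Activation planning": (4.0, 4.8),
    "Local persistence": (3.7, 4.5),
    "Policy, review, and audit": (3.9, 5.0),
    "Observability and operations": (3.6, 4.8),
    "Markitect interop": (3.7, 4.5),
    "Kontextual/Infospace interop": (3.3, 4.5),
    "Testing and evaluation": (4.1, 4.5),
    "Service readiness": (4.2, 4.8),
    "Developer experience": (4.1, 4.5),
}


def unweighted_mean(dimensions):
    """Plain average of the dimension scores (an assumption, not the scorecard's formula)."""
    scores = [score for score, _target in dimensions.values()]
    return sum(scores) / len(scores)


def largest_gaps(dimensions, top=4):
    """Return the `top` dimensions with the largest target-minus-score gap."""
    gaps = {
        name: round(target - score, 1)
        for name, (score, target) in dimensions.items()
    }
    # Sort by gap descending, then by name so ties are deterministic.
    return sorted(gaps.items(), key=lambda item: (-item[1], item[0]))[:top]


if __name__ == "__main__":
    print(f"unweighted mean: {unweighted_mean(DIMENSIONS):.2f}")
    for name, gap in largest_gaps(DIMENSIONS):
        print(f"{gap:.1f}  {name}")
```

Sorted this way, the largest gaps fall on Kontextual/Infospace interop, observability and operations, policy/review/audit, and the phase domain model.
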
## Assessment

The project has crossed the local integration-readiness threshold. The runtime
envelopes, policy/review model, profile-derived configuration, lifecycle rules,
local persistence diagnostics, queryable audit path, fake external pack
manifests, and conformance helpers form a solid integration boundary.

The biggest optimization opportunity is now the next operational layer:
turning diagnostic-only durability into operator actions, adding optional
deployable service bindings, and testing live or live-shaped adapters behind
the same conformance suite as the fake pack.

## Completed Refinement Workplan

`PMEM-WP-0011` moved the score from 3.8 to 4.0 by adding:

- full local service runner parity for `SERVICE_OPERATIONS`;
- service-covered `package.compile`, `lifecycle.apply`, and `audit.query`;
- queryable audit sinks with retention metadata;
- local-store atomic JSON writes, migration diagnostics, and corrupt-record
  repair diagnostics;
- three evaluation scenario families covering policy denial, lifecycle rules,
  event-path activation, semantic-index hints, and budget pressure;
- adapter pack manifests and explicit missing-capability diagnostics;
- an operational end-to-end recipe.

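For readers unfamiliar with the persistence items above, the following is a minimal sketch of what "atomic JSON writes" and "corrupt-record diagnostics" commonly mean for a file-backed store. It is not the repository's implementation; `atomic_write_json` and `scan_jsonl` are hypothetical names, and the write-to-temp-then-`os.replace` pattern is assumed rather than confirmed by the source.

```python
import json
import os
import tempfile


def atomic_write_json(path, payload):
    """Write JSON so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename over the destination
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


def scan_jsonl(path):
    """Return (records, diagnostics); diagnostics name the lines that failed to parse."""
    records, diagnostics = [], []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # blank lines are skipped, not reported
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as error:
                diagnostics.append({"line": line_number, "error": error.msg})
    return records, diagnostics
```

A scan like this only reports damage. Repairing or rewriting the log would be a separate, operator-driven step.
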
## Recommended Next Refinement

Create and execute `PMEM-WP-0012`: live-adapter and service-binding readiness.

Highest-value tasks:

- Add an optional framework binding around `LocalServiceRunner` with health and
  readiness endpoints.
- Add executable local-store migrations, not only diagnostics.
- Add live-shaped Markitect/Kontextual adapter fixtures behind the manifest and
  conformance suite.
- Add audit retention enforcement and telemetry export drills.
- Grow the evaluation corpus into threshold reports that can catch regressions.

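To illustrate the difference between migration diagnostics and executable migrations, here is a hedged sketch of a versioned migration runner with a dry-run mode. Everything in it is hypothetical: the `schema_version` field, the migration steps, and the function names are invented for illustration and do not describe the repository's metadata format.

```python
# Hypothetical registry: each function upgrades metadata from version N to N + 1.
def _v1_to_v2(meta):
    upgraded = dict(meta)
    upgraded.setdefault("retention", {"max_age_days": None})
    return upgraded


def _v2_to_v3(meta):
    upgraded = dict(meta)
    upgraded["event_log"] = {"format": upgraded.pop("log_format", "jsonl")}
    return upgraded


MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}
CURRENT_VERSION = 3


def plan_migration(meta):
    """Diagnostic step: list the versions that would be upgraded, changing nothing."""
    version = meta.get("schema_version", 1)
    if version > CURRENT_VERSION:
        raise ValueError(f"store version {version} is newer than {CURRENT_VERSION}")
    steps = list(range(version, CURRENT_VERSION))
    missing = [step for step in steps if step not in MIGRATIONS]
    if missing:
        raise ValueError(f"no migration registered for versions {missing}")
    return steps


def apply_migration(meta, *, dry_run=True):
    """Executable step: run the planned upgrades unless dry_run is set."""
    steps = plan_migration(meta)
    if dry_run:
        return meta, steps
    migrated = dict(meta)
    for version in steps:
        migrated = MIGRATIONS[version](migrated)
        migrated["schema_version"] = version + 1
    return migrated, steps
```

Defaulting to `dry_run=True` keeps the diagnostic behavior as the safe path and makes the operator opt in to the actual rewrite.
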
## Score Movement Gates

Achieved overall score **4.0** when:

- Service runner handles every operation in `SERVICE_OPERATIONS`.
- Audit query and lifecycle apply are covered through service contracts.
- Local persistence has migration diagnostics.
- Evaluation fixtures cover at least three profile/graph families.

Move overall score to **4.3+** when:

- A live optional Markitect or Kontextual adapter can be used behind the same
  conformance suite as the fake pack.
- Operational docs include a deployable service binding or a clear embedding
  recipe.

Move overall score to **4.7+** only when:

- Live adapter behavior, telemetry, audit retention, migration, and evaluation
  gates are all exercised by repeatable tests or documented operator drills.

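The gates above can be read as an ordered checklist: a threshold is unlocked only when all of its conditions hold and every lower threshold is already unlocked. The sketch below encodes that reading under stated assumptions. The gate identifiers are hypothetical paraphrases of the bullets, the 3.8 floor is the pre-`PMEM-WP-0011` score mentioned earlier, and the function returns the unlocked threshold, not a computed score.

```python
# Hypothetical gate identifiers paraphrasing the bullets above, lowest threshold first.
GATES = [
    (4.0, [
        "service_runner_parity",
        "service_covered_audit_and_apply",
        "migration_diagnostics",
        "three_evaluation_families",
    ]),
    (4.3, [
        "live_adapter_behind_conformance_suite",
        "deployable_binding_or_embedding_recipe",
    ]),
    (4.7, [
        "operational_behaviors_exercised",
    ]),
]


def highest_unlocked_threshold(satisfied, floor=3.8):
    """Return the highest threshold whose gates, and all lower gates, are satisfied."""
    unlocked = floor
    for threshold, required in GATES:
        if not all(gate in satisfied for gate in required):
            break  # a missed threshold blocks every higher one
        unlocked = threshold
    return unlocked


if __name__ == "__main__":
    done = {
        "service_runner_parity",
        "service_covered_audit_and_apply",
        "migration_diagnostics",
        "three_evaluation_families",
    }
    print(highest_unlocked_threshold(done))
```
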