infospace-bench/docs/agentic-memory-profile-pilot.md

Agentic Memory Profile Pilot

Date: 2026-05-15
Workplan: IB-WP-0017

Purpose

This pilot validates agentic memory profile fixtures against concrete infospace work. It does not add reusable memory runtime infrastructure to infospace-bench.

Pilot Selection

The selected corpus is infospaces/wealth-vsm-legacy-slice. It is bounded, reviewable, and already contains a source, entities, relation, evaluation, metrics, history, and an engine sync plan. That makes it a better pilot than a new synthetic corpus because the memory package can be evaluated against a real restart task: resume review of the Wealth/VSM entity and relation neighborhood.

Memory Question Matrix

| Memory Question | Pilot Evidence | Acceptance Threshold |
| --- | --- | --- |
| Which reasoning decisions should become durable memory? | decision.file-backed-pilot and constraint.no-durable-runtime | A restart package explains ownership boundaries without rereading Workplan 17. |
| Which conversation or workflow events are useful later? | trace.entity-review-restart and event.workflow-restart-trace | Events explain why a package item exists and what task it supports. |
| Which knowledge graph neighborhoods improve review? | Wealth/VSM source and entity nodes | The package includes the active artifact neighborhood, not only planning notes. |
| Which context package shapes help agents? | restart-context-selection.yaml | Eight or fewer items, source spans preserved, no live LLM required. |
| Which profile parameters are too abstract or misplaced? | context-package-evaluation.yaml | Contract feedback is routed to Markitect or the engine, not hidden in this repo. |

Fixture Contracts

The checked-in pilot uses the following Markitect contract versions:

  • markitect.memory.profile.v1
  • markitect.memory.graph.v1
  • markitect.memory.selection.v1

The default test suite validates the profile and graph through markitect_tool.memory.graph, compiles the selection to a context package, and checks the deterministic fields against restart-context-package.expected.yaml.
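The deterministic-field check described above can be pictured with a minimal sketch. It does not call markitect_tool and makes no claim about that package's API; the volatile key names (`generated_at`, `run_id`) and both helper functions are hypothetical, chosen only to show how a compiled package might be compared against a golden fixture while ignoring fields that change between runs.

```python
from typing import Any

# Hypothetical keys that vary between runs; the real fixture may use others.
VOLATILE_KEYS = {"generated_at", "run_id"}


def strip_volatile(value: Any) -> Any:
    """Recursively drop volatile keys so only deterministic fields remain."""
    if isinstance(value, dict):
        return {
            key: strip_volatile(item)
            for key, item in value.items()
            if key not in VOLATILE_KEYS
        }
    if isinstance(value, list):
        return [strip_volatile(item) for item in value]
    return value


def deterministic_match(actual: dict, expected: dict) -> bool:
    """Return True when both packages agree once volatile fields are removed."""
    return strip_volatile(actual) == strip_volatile(expected)


# Two packages that differ only in volatile fields compare as equal.
compiled = {"items": [{"id": "n1", "generated_at": "t1"}], "run_id": "x"}
golden = {"items": [{"id": "n1", "generated_at": "t2"}], "run_id": "y"}
print(deterministic_match(compiled, golden))
```

In a real test the two dictionaries would be loaded from the compiled package and from restart-context-package.expected.yaml rather than written inline.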

Context Package Evaluation

The restart package is considered useful when it:

  • contains the boundary decision, no-runtime constraint, package plan, review gate, and active Wealth/VSM artifact neighborhood
  • preserves provenance for all selected nodes or synthetic Markitect event spans
  • remains under the declared 1200-token package budget
  • keeps runtime writes review-gated and fixture-only

The first pilot snapshot scores restart quality at 4.2/5.0 and provenance coverage at 1.0.
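As a rough illustration of how the first three criteria could be checked mechanically, the sketch below evaluates a list of package items. The `PackageItem` shape, the kind names in `REQUIRED_KINDS`, and the `evaluate` function are all hypothetical stand-ins, not the fixture's actual schema. The budget is treated as inclusive, which is an assumption, and the review-gating criterion and the restart-quality score are not modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional

TOKEN_BUDGET = 1200  # declared package budget from the pilot

# Hypothetical kind names standing in for the five required package contents.
REQUIRED_KINDS = {
    "boundary_decision",
    "no_runtime_constraint",
    "package_plan",
    "review_gate",
    "artifact_neighborhood",
}


@dataclass
class PackageItem:
    kind: str
    tokens: int
    provenance: Optional[str] = None  # e.g. a source span or event span id


def provenance_coverage(items: List[PackageItem]) -> float:
    """Fraction of items that carry a non-empty provenance reference."""
    if not items:
        return 0.0
    return sum(1 for item in items if item.provenance) / len(items)


def evaluate(items: List[PackageItem]) -> dict:
    """Summarize required-content, provenance, and budget checks."""
    present = {item.kind for item in items}
    return {
        "missing_kinds": sorted(REQUIRED_KINDS - present),
        "provenance_coverage": provenance_coverage(items),
        # Assumption: a package exactly at the budget still passes.
        "within_budget": sum(item.tokens for item in items) <= TOKEN_BUDGET,
    }
```

A provenance coverage of 1.0, as reported for the first snapshot, corresponds in this sketch to every item carrying a provenance reference.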

Engine Integration Plan

File-backed in this pilot:

  • selected corpus and infospace manifest
  • Markitect memory profile, graph, and selection fixtures
  • expected package shape and evaluation metrics
  • workflow trace examples and review notes

Engine-backed later:

  • durable memory node, edge, event, and audit storage
  • permission-aware query and activation behavior
  • retention, refresh, compaction, and policy decisions
  • dry-run and apply plans for durable memory writes

The first integration should mirror this fixture into kontextual-engine as an imported Markitect graph. A dry run should report creates, updates, denied writes, and policy reasons. An apply should require an explicit review gate and record an engine audit event separately from Markitect contract events.
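The dry-run and apply split can be sketched as follows. Nothing here reflects an existing kontextual-engine API: `DryRunReport`, `dry_run`, `apply_plan`, the `engine.audit.apply` event type, and the dictionary-based node store are hypothetical, used only to show a plan that reports creates, updates, and denied writes with reasons, and an apply step that refuses to run without review approval.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DryRunReport:
    creates: List[str] = field(default_factory=list)
    updates: List[str] = field(default_factory=list)
    denied: Dict[str, str] = field(default_factory=dict)  # node id -> policy reason


def dry_run(incoming: dict, existing: dict, policy_denials: Dict[str, str]) -> DryRunReport:
    """Classify each incoming node without writing anything."""
    report = DryRunReport()
    for node_id, node in incoming.items():
        if node_id in policy_denials:
            report.denied[node_id] = policy_denials[node_id]
        elif node_id not in existing:
            report.creates.append(node_id)
        elif existing[node_id] != node:
            report.updates.append(node_id)
    return report


def apply_plan(report: DryRunReport, incoming: dict, existing: dict,
               review_approved: bool, audit_log: list) -> None:
    """Write planned changes only after an explicit review gate."""
    if not review_approved:
        raise PermissionError("apply requires an explicit review gate")
    for node_id in report.creates + report.updates:
        existing[node_id] = incoming[node_id]
    # Engine audit event, kept in its own log rather than mixed into
    # Markitect contract events.
    audit_log.append({
        "type": "engine.audit.apply",  # hypothetical event type name
        "creates": list(report.creates),
        "updates": list(report.updates),
        "denied": dict(report.denied),
    })
```

Keeping the report as plain data means the same object can be shown to a reviewer and then passed unchanged to the apply step.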

Architecture Feedback

Markitect contract feedback:

  • Add a timestamp-stable context package output mode for golden fixtures.
  • Document when selected events should become package items versus metadata.
  • Make package provenance for implied edges easy to inspect.

Kontextual engine feedback:

  • Import Markitect graph/profile envelopes without redefining node vocabulary.
  • Persist runtime audit events separately from Markitect memory events.
  • Keep durable memory updates review-gated and export Markitect-compatible package inputs.

Infospace-bench boundary:

  • Keep corpus selection, applied metrics, evaluation history, workflow traces, and practical package-quality evidence here.
  • Do not store credentials, durable user memory, or general graph/event persistence inside an infospace.