From c3b62a6ec3a9a0dbc6868c45cf547a95093e2cc5 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Fri, 15 May 2026 16:01:35 +0200
Subject: [PATCH] Agentic memory profile

---
 README.md | 2 +
 docs/agentic-memory-profile-pilot.md | 97 +++++++
 .../artifacts/index.yaml | 88 ++++++
 .../artifacts/sources/memory-pilot-brief.md | 33 +++
 .../infospace.yaml | 38 +++
 .../memory/context-package-evaluation.yaml | 51 ++++
 .../output/memory/memory-graph.yaml | 273 ++++++++++++++++++
 .../output/memory/memory-profile.yaml | 70 +++++
 .../restart-context-package.expected.yaml | 44 +++
 .../memory/restart-context-selection.yaml | 27 ++
 .../memory/traces/entity-review-restart.yaml | 31 ++
 .../traces/generation-plan-decision.yaml | 25 ++
 .../metrics/memory-profile-history.yaml | 17 ++
 .../output/metrics/metrics.yaml | 11 +
 .../reports/memory-profile-pilot.md | 34 +++
 tests/test_agentic_memory_profile.py | 92 ++++++
 ...IB-WP-0017-agentic-memory-profile-pilot.md | 34 ++-
 17 files changed, 960 insertions(+), 7 deletions(-)
 create mode 100644 docs/agentic-memory-profile-pilot.md
 create mode 100644 infospaces/agentic-memory-profile-pilot/artifacts/index.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/artifacts/sources/memory-pilot-brief.md
 create mode 100644 infospaces/agentic-memory-profile-pilot/infospace.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/memory/memory-graph.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/memory/memory-profile.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/memory/restart-context-package.expected.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/memory/restart-context-selection.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/memory/traces/entity-review-restart.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/memory/traces/generation-plan-decision.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/metrics/memory-profile-history.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/output/metrics/metrics.yaml
 create mode 100644 infospaces/agentic-memory-profile-pilot/reports/memory-profile-pilot.md
 create mode 100644 tests/test_agentic_memory_profile.py

diff --git a/README.md b/README.md
index 5202057..4b8c414 100644
--- a/README.md
+++ b/README.md
@@ -32,10 +32,12 @@ Start with:
 - `docs/replacement-readiness-decision.md`
 - `docs/wealth-vsm-generation-pipeline.md`
 - `docs/generic-source-generator.md`
+- `docs/agentic-memory-profile-pilot.md`
 - `docs/lefevre-epub3-validation.md`
 - `infospaces/bootstrap-pilot/`
 - `infospaces/wealth-vsm-legacy-slice/`
 - `infospaces/wealth-vsm-generation-pilot/`
+- `infospaces/agentic-memory-profile-pilot/`
 - `workplans/`
 
 Current development command:
diff --git a/docs/agentic-memory-profile-pilot.md b/docs/agentic-memory-profile-pilot.md
new file mode 100644
index 0000000..ff28977
--- /dev/null
+++ b/docs/agentic-memory-profile-pilot.md
@@ -0,0 +1,97 @@
+# Agentic Memory Profile Pilot
+
+Date: 2026-05-15
+Workplan: IB-WP-0017
+
+## Purpose
+
+This pilot validates agentic memory profile fixtures against concrete
+infospace work. It does not add reusable memory runtime infrastructure to
+`infospace-bench`.
+
+## Pilot Selection
+
+The selected corpus is `infospaces/wealth-vsm-legacy-slice`. It is bounded,
+reviewable, and already contains a source, entities, relation, evaluation,
+metrics, history, and an engine sync plan. That makes it a better pilot than a
+new synthetic corpus because the memory package can be evaluated against a real
+restart task: resume review of the Wealth/VSM entity and relation neighborhood.
+
+## Memory Question Matrix
+
+| Memory Question | Pilot Evidence | Acceptance Threshold |
+| --- | --- | --- |
+| Which reasoning decisions should become durable memory? | `decision.file-backed-pilot` and `constraint.no-durable-runtime` | A restart package explains ownership boundaries without rereading Workplan 17. |
+| Which conversation or workflow events are useful later? | `trace.entity-review-restart` and `event.workflow-restart-trace` | Events explain why a package item exists and what task it supports. |
+| Which knowledge graph neighborhoods improve review? | Wealth/VSM source and entity nodes | The package includes the active artifact neighborhood, not only planning notes. |
+| Which context package shapes help agents? | `restart-context-selection.yaml` | Eight or fewer items, source spans preserved, no live LLM required. |
+| Which profile parameters are too abstract or misplaced? | `context-package-evaluation.yaml` | Contract feedback is routed to Markitect or the engine, not hidden in this repo. |
+
+## Fixture Contracts
+
+The checked-in pilot uses Markitect contract versions:
+
+- `markitect.memory.profile.v1`
+- `markitect.memory.graph.v1`
+- `markitect.memory.selection.v1`
+
+The default test suite validates the profile and graph through
+`markitect_tool.memory.graph`, compiles the selection to a context package, and
+checks the deterministic fields against
+`restart-context-package.expected.yaml`.
+
+## Context Package Evaluation
+
+The restart package is considered useful when it:
+
+- contains the boundary decision, no-runtime constraint, package plan, review
+  gate, and active Wealth/VSM artifact neighborhood
+- preserves provenance for all selected nodes or synthetic Markitect event spans
+- remains under the declared 1200-token package budget
+- keeps runtime writes review-gated and fixture-only
+
+The first pilot snapshot scores restart quality at `4.2/5.0` and provenance
+coverage at `1.0`.
+
+## Engine Integration Plan
+
+File-backed in this pilot:
+
+- selected corpus and infospace manifest
+- Markitect memory profile, graph, and selection fixtures
+- expected package shape and evaluation metrics
+- workflow trace examples and review notes
+
+Engine-backed later:
+
+- durable memory node, edge, event, and audit storage
+- permission-aware query and activation behavior
+- retention, refresh, compaction, and policy decisions
+- dry-run and apply plans for durable memory writes
+
+The first integration should mirror this fixture into `kontextual-engine` as an
+imported Markitect graph. Dry run should report creates, updates, denied writes,
+and policy reasons. Apply should require an explicit review gate and record an
+engine audit event separately from Markitect contract events.
+
+## Architecture Feedback
+
+Markitect contract feedback:
+
+- Add a timestamp-stable context package output mode for golden fixtures.
+- Document when selected events should become package items versus metadata.
+- Make package provenance for implied edges easy to inspect.
+
+Kontextual engine feedback:
+
+- Import Markitect graph/profile envelopes without redefining node vocabulary.
+- Persist runtime audit events separately from Markitect memory events.
+- Keep durable memory updates review-gated and export Markitect-compatible
+  package inputs.
+
+Infospace-bench boundary:
+
+- Keep corpus selection, applied metrics, evaluation history, workflow traces,
+  and practical package-quality evidence here.
+- Do not store credentials, durable user memory, or general graph/event
+  persistence inside an infospace.
diff --git a/infospaces/agentic-memory-profile-pilot/artifacts/index.yaml b/infospaces/agentic-memory-profile-pilot/artifacts/index.yaml
new file mode 100644
index 0000000..dc8fb7f
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/artifacts/index.yaml
@@ -0,0 +1,88 @@
+artifacts:
+  - id: source/memory-pilot-brief.md
+    path: artifacts/sources/memory-pilot-brief.md
+    kind: source
+    title: Agentic Memory Profile Pilot Brief
+    provenance:
+      workplan: IB-WP-0017
+      selected_corpus: infospaces/wealth-vsm-legacy-slice
+      created_at: "2026-05-15T00:00:00Z"
+    relationships: []
+  - id: generated/memory-profile.yaml
+    path: output/memory/memory-profile.yaml
+    kind: generated
+    title: Agentic Memory Pilot Profile
+    provenance:
+      workflow_id: memory-profile-pilot
+      stage_id: validate-memory-fixtures
+      contract: markitect.memory.profile.v1
+    relationships:
+      - type: generated_from
+        target: source/memory-pilot-brief.md
+  - id: generated/memory-graph.yaml
+    path: output/memory/memory-graph.yaml
+    kind: generated
+    title: Agentic Memory Pilot Graph
+    provenance:
+      workflow_id: memory-profile-pilot
+      stage_id: validate-memory-fixtures
+      contract: markitect.memory.graph.v1
+    relationships:
+      - type: generated_from
+        target: source/memory-pilot-brief.md
+      - type: governed_by
+        target: generated/memory-profile.yaml
+  - id: generated/restart-context-selection.yaml
+    path: output/memory/restart-context-selection.yaml
+    kind: generated
+    title: Restart Context Selection
+    provenance:
+      workflow_id: memory-profile-pilot
+      stage_id: evaluate-restart-package
+      contract: markitect.memory.selection.v1
+    relationships:
+      - type: selects_from
+        target: generated/memory-graph.yaml
+      - type: governed_by
+        target: generated/memory-profile.yaml
+  - id: generated/restart-context-package.expected.yaml
+    path: output/memory/restart-context-package.expected.yaml
+    kind: generated
+    title: Expected Restart Context Package
+    provenance:
+      workflow_id: memory-profile-pilot
+      stage_id: evaluate-restart-package
+      expectation: compiled-by-markitect-memory-graph-pack
+    relationships:
+      - type: generated_from
+        target: generated/restart-context-selection.yaml
+  - id: generated/context-package-evaluation.yaml
+    path: output/memory/context-package-evaluation.yaml
+    kind: generated
+    title: Context Package Usefulness Evaluation
+    provenance:
+      workflow_id: memory-profile-pilot
+      stage_id: evaluate-restart-package
+    relationships:
+      - type: evaluates
+        target: generated/restart-context-package.expected.yaml
+  - id: generated/generation-plan-decision-trace.yaml
+    path: output/memory/traces/generation-plan-decision.yaml
+    kind: generated
+    title: Generation Plan Decision Trace
+    provenance:
+      workflow_id: memory-profile-pilot
+      stage_id: applied-workflow-traces
+    relationships:
+      - type: generated_from
+        target: source/memory-pilot-brief.md
+  - id: generated/entity-review-restart-trace.yaml
+    path: output/memory/traces/entity-review-restart.yaml
+    kind: generated
+    title: Entity Review Restart Trace
+    provenance:
+      workflow_id: memory-profile-pilot
+      stage_id: applied-workflow-traces
+    relationships:
+      - type: generated_from
+        target: source/memory-pilot-brief.md
diff --git a/infospaces/agentic-memory-profile-pilot/artifacts/sources/memory-pilot-brief.md b/infospaces/agentic-memory-profile-pilot/artifacts/sources/memory-pilot-brief.md
new file mode 100644
index 0000000..02f609b
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/artifacts/sources/memory-pilot-brief.md
@@ -0,0 +1,33 @@
+# Agentic Memory Profile Pilot Brief
+
+## Pilot Corpus
+
+The pilot uses `infospaces/wealth-vsm-legacy-slice` as the concrete corpus. It
+is small enough to inspect by hand, already contains a source, entities,
+relations, evaluations, metrics, history, and an engine sync plan, and it
+exercises the same restart and review questions that a memory package should
+help with.
+
+The fixture also references the generic source generator because that workflow
+is the likely producer of future memory events. The pilot does not create a
+durable memory store; it records file-backed evidence and Markitect-compatible
+contracts only.
+
+## Memory Questions
+
+| Question | Fixture Coverage | Acceptance Signal |
+| --- | --- | --- |
+| Which reasoning decisions should become durable memory? | `decision.file-backed-pilot` and `constraint.no-durable-runtime` | A restart package can recover the boundary without reading every workplan. |
+| Which workflow events are useful later? | `turn.review-handoff`, `tool_call.generator-status`, and `observation.restart-risk` | Trace events explain why the package contains both decisions and artifact neighborhoods. |
+| Which knowledge graph neighborhoods improve review? | `entity.division-of-labour`, `entity.market-extent`, and `artifact.wealth-source` | The package ties the decision path to the concrete Wealth/VSM artifacts. |
+| Which context package shape helps agents? | `restart-context-selection.yaml` | The package stays within eight items and 1200 tokens. |
+| Which profile parameters are misplaced or missing? | `context-package-evaluation.yaml` | Feedback separates Markitect contracts, engine runtime needs, and infospace evaluation knobs. |
+
+## Acceptance Targets
+
+- Memory profile and graph fixtures validate with `markitect-tool`.
+- Context package compilation is deterministic aside from runtime timestamps.
+- Every selected package item has a source span or a Markitect synthetic memory
+  span.
+- No provider credentials, secrets, or durable user memory are stored.
+- Runtime persistence remains delegated to `kontextual-engine`.
diff --git a/infospaces/agentic-memory-profile-pilot/infospace.yaml b/infospaces/agentic-memory-profile-pilot/infospace.yaml
new file mode 100644
index 0000000..eec17a9
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/infospace.yaml
@@ -0,0 +1,38 @@
+slug: agentic-memory-profile-pilot
+name: Agentic Memory Profile Pilot
+topic:
+  name: Agentic Memory Profile Pilot
+  domain: Agentic Memory Evaluation
+  sources: artifacts/sources
+disciplines:
+  - name: Markitect Memory Graph Contract
+    path: output/memory/memory-profile.yaml
+schemas: {}
+workflows:
+  - id: memory-profile-pilot
+    description: File-backed pilot for Markitect-compatible agentic memory profile fixtures.
+    inputs:
+      source:
+        kind: source
+    stages:
+      - id: select-pilot-corpus
+        kind: manual
+        input: source
+        template: reports/memory-profile-pilot.md
+      - id: validate-memory-fixtures
+        kind: manual
+        input: source
+        template: output/memory/memory-graph.yaml
+      - id: evaluate-restart-package
+        kind: manual
+        input: source
+        template: output/memory/context-package-evaluation.yaml
+viability:
+  provenance_coverage_ratio:
+    min: 1.0
+  restart_quality_score:
+    min: 4.0
+  selected_node_count:
+    min: 6
+  context_package_budget_max_tokens:
+    max: 1200
diff --git a/infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml b/infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml
new file mode 100644
index 0000000..a580998
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml
@@ -0,0 +1,51 @@
+schema_version: infospace-bench.memory-context-evaluation.v1
+evaluated_at: "2026-05-15T00:20:00Z"
+workplan: IB-WP-0017
+profile_id: infospace-agentic-memory-pilot
+graph_id: infospace-agentic-memory-graph
+selection: restart-context-selection.yaml
+questions:
+  - id: restart-quality
+    prompt: Can a later agent resume Wealth/VSM review with the selected package?
+    score: 4.2
+    max_score: 5.0
+    result: pass
+  - id: provenance-review
+    prompt: Can a reviewer see why each memory item exists?
+    score: 5.0
+    max_score: 5.0
+    result: pass
+  - id: budget-realism
+    prompt: Does the selection fit a compact restart budget?
+    score: 4.0
+    max_score: 5.0
+    result: pass
+  - id: noise-control
+    prompt: Does the package omit low-value trace detail?
+    score: 3.8
+    max_score: 5.0
+    result: watch
+metrics:
+  restart_quality_score: 4.2
+  provenance_coverage_ratio: 1.0
+  selected_node_count: 7
+  expected_item_count: 8
+  context_package_budget_max_tokens: 1200
+  live_llm_required: false
+findings:
+  - id: finding.restart-risk
+    summary: Decision-only memory is not enough; restart packages need the active source/entity neighborhood.
+  - id: finding.neighborhood-improves-review
+    summary: The Division of Labour source and entity references make the boundary decision actionable for review.
+  - id: finding.profile-gap
+    summary: Profile retention intent is useful, but acceptance thresholds remain application-level metrics in infospace-bench.
+recommended_contract_changes:
+  markitect-tool:
+    - Add an option for timestamp-stable context package fixture output to simplify cross-repo golden files.
+    - Document when selected events should become package items versus metadata.
+  kontextual-engine:
+    - Import Markitect graph/profile envelopes and persist runtime audit events separately from contract events.
+    - Keep durable write plans review-gated and export Markitect-compatible package input envelopes.
+  infospace-bench:
+    - Keep memory quality metrics and pilot corpora here.
+    - Do not store user memory, credentials, or runtime graph state in an infospace.
diff --git a/infospaces/agentic-memory-profile-pilot/output/memory/memory-graph.yaml b/infospaces/agentic-memory-profile-pilot/output/memory/memory-graph.yaml
new file mode 100644
index 0000000..7e9a8d5
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/output/memory/memory-graph.yaml
@@ -0,0 +1,273 @@
+schema_version: markitect.memory.graph.v1
+id: infospace-agentic-memory-graph
+title: Infospace Agentic Memory Pilot Graph
+intent: Preserve the reviewed memory decisions, workflow events, and Wealth/VSM artifact neighborhood used to evaluate restart context packages.
+namespace:
+  project: infospace-bench
+  task: IB-WP-0017
+nodes:
+  - id: question.memory-decisions
+    kind: question
+    text: Which infospace generation and review decisions are useful enough to reactivate for later agent work?
+    source_spans:
+      - path: workplans/IB-WP-0017-agentic-memory-profile-pilot.md
+        unit_kind: section
+        selector: tasks[id=IB-WP-0017-T01]
+        engine: selector
+    metadata:
+      title: Memory question
+      summary: The pilot starts from concrete restart questions, not a runtime feature request.
+  - id: decision.file-backed-pilot
+    kind: decision
+    text: Keep the pilot file-backed in infospace-bench and use Markitect contracts for validation and package compilation.
+    source_spans:
+      - path: docs/agentic-memory-profile-pilot.md
+        unit_kind: section
+        selector: heading[Decision]
+        engine: selector
+    metadata:
+      title: File-backed pilot boundary
+      summary: Infospace-bench owns evidence and evaluation while Markitect owns contract compilation.
+  - id: constraint.no-durable-runtime
+    kind: constraint
+    text: Do not implement graph or event persistence in infospace-bench; durable runtime state belongs behind kontextual-engine.
+    source_spans:
+      - path: workplans/IB-WP-0017-agentic-memory-profile-pilot.md
+        unit_kind: section
+        selector: heading[Non-Goals]
+        engine: selector
+    metadata:
+      title: No durable memory runtime
+  - id: evidence.workplan-non-goal
+    kind: evidence
+    text: Workplan 17 explicitly forbids graph/event persistence, schema redefinition, live LLM requirements, and secrets in the infospace.
+    source_spans:
+      - path: workplans/IB-WP-0017-agentic-memory-profile-pilot.md
+        unit_kind: section
+        selector: heading[Non-Goals]
+        engine: selector
+    metadata:
+      title: Workplan non-goal evidence
+  - id: plan.restart-context-package
+    kind: plan
+    text: Evaluate a restart package that activates the boundary decision, one workflow trace, and the Wealth/VSM artifact neighborhood under a tight budget.
+    source_spans:
+      - path: infospaces/agentic-memory-profile-pilot/output/memory/restart-context-selection.yaml
+        unit_kind: mapping
+        selector: node_ids
+        engine: yaml
+    metadata:
+      title: Restart package plan
+      summary: The package should improve restart quality without becoming noisy.
+  - id: turn.review-handoff
+    kind: turn
+    text: Reviewer asks whether a later agent can resume entity and relation review without rereading every generated report.
+    source_spans:
+      - path: infospaces/agentic-memory-profile-pilot/output/memory/traces/entity-review-restart.yaml
+        unit_kind: trace
+        selector: events[0]
+        engine: yaml
+    metadata:
+      title: Review handoff turn
+  - id: tool_call.generator-status
+    kind: tool_call
+    text: The workflow status check reports one source, two entities, one relation, one evaluation snapshot, and no stale source artifacts.
+    source_spans:
+      - path: infospaces/wealth-vsm-legacy-slice/output/metrics/metrics.yaml
+        unit_kind: mapping
+        selector: metrics
+        engine: yaml
+    metadata:
+      title: Generator status check
+  - id: observation.restart-risk
+    kind: observation
+    text: Restart context is incomplete if it contains only decisions and omits the source/entity neighborhood under review.
+    source_spans:
+      - path: infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml
+        unit_kind: mapping
+        selector: findings[0]
+        engine: yaml
+    metadata:
+      title: Restart risk observation
+  - id: artifact.wealth-source
+    kind: artifact
+    text: The Wealth/VSM legacy slice source is Book I Chapter III, used as the bounded corpus for the memory profile pilot.
+    source_spans:
+      - path: infospaces/wealth-vsm-legacy-slice/artifacts/sources/book-1-chapter-03.md
+        unit_kind: document
+        selector: path
+        engine: filesystem
+    metadata:
+      title: Book I Chapter III source
+      artifact_id: source/book-1-chapter-03.md
+  - id: entity.division-of-labour
+    kind: entity
+    text: Division of Labour is the generated entity whose review quality depends on preserving the source relation to market extent.
+    source_spans:
+      - path: infospaces/wealth-vsm-legacy-slice/artifacts/entities/division-of-labour.md
+        unit_kind: document
+        selector: path
+        engine: filesystem
+    metadata:
+      title: Division of Labour
+      artifact_id: entity/division-of-labour.md
+  - id: entity.market-extent
+    kind: entity
+    text: Market Extent is the paired entity in the Wealth/VSM relation neighborhood.
+    source_spans:
+      - path: infospaces/wealth-vsm-legacy-slice/artifacts/entities/market-extent.md
+        unit_kind: document
+        selector: path
+        engine: filesystem
+    metadata:
+      title: Market Extent
+      artifact_id: entity/market-extent.md
+  - id: finding.neighborhood-improves-review
+    kind: finding
+    text: The useful restart package combines the boundary decision with the concrete source and entity neighborhood being reviewed.
+    source_spans:
+      - path: infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml
+        unit_kind: mapping
+        selector: findings[1]
+        engine: yaml
+    metadata:
+      title: Neighborhood improves review
+  - id: profile.agentic-memory-pilot
+    kind: profile
+    text: The profile enables reasoning, conversation, knowledge, and package memory kinds with review-gated durable writes.
+    source_spans:
+      - path: infospaces/agentic-memory-profile-pilot/output/memory/memory-profile.yaml
+        unit_kind: mapping
+        selector: $
+        engine: yaml
+    metadata:
+      title: Agentic memory pilot profile
+  - id: context_package.restart-package
+    kind: context_package
+    text: The restart package selection is capped at eight items and must preserve selected nodes, implied edges, events, policy metadata, and provenance.
+    source_spans:
+      - path: infospaces/agentic-memory-profile-pilot/output/memory/restart-context-selection.yaml
+        unit_kind: mapping
+        selector: $
+        engine: yaml
+    metadata:
+      title: Restart context package
+      package_id: memory:package:agentic-memory-profile-restart
+  - id: policy.review-gate
+    kind: policy
+    text: Any future durable memory write must be planned, reviewed, and applied through an explicit runtime gate.
+    source_spans:
+      - path: docs/agentic-memory-profile-pilot.md
+        unit_kind: section
+        selector: heading[Engine Integration Plan]
+        engine: selector
+    metadata:
+      title: Durable write review gate
+edges:
+  - id: edge.evidence-supports-boundary
+    kind: supports
+    source: evidence.workplan-non-goal
+    target: decision.file-backed-pilot
+  - id: edge.constraint-supports-boundary
+    kind: supports
+    source: constraint.no-durable-runtime
+    target: decision.file-backed-pilot
+  - id: edge.question-led-to-boundary
+    kind: led_to
+    source: question.memory-decisions
+    target: decision.file-backed-pilot
+  - id: edge.boundary-led-to-package-plan
+    kind: led_to
+    source: decision.file-backed-pilot
+    target: plan.restart-context-package
+  - id: edge.turn-led-to-observation
+    kind: led_to
+    source: turn.review-handoff
+    target: observation.restart-risk
+  - id: edge.tool-call-led-to-observation
+    kind: led_to
+    source: tool_call.generator-status
+    target: observation.restart-risk
+  - id: edge.observation-supports-finding
+    kind: supports
+    source: observation.restart-risk
+    target: finding.neighborhood-improves-review
+  - id: edge.division-derived-from-source
+    kind: derived_from
+    source: entity.division-of-labour
+    target: artifact.wealth-source
+  - id: edge.market-derived-from-source
+    kind: derived_from
+    source: entity.market-extent
+    target: artifact.wealth-source
+  - id: edge.finding-references-division
+    kind: references
+    source: finding.neighborhood-improves-review
+    target: entity.division-of-labour
+  - id: edge.finding-references-market
+    kind: references
+    source: finding.neighborhood-improves-review
+    target: entity.market-extent
+  - id: edge.profile-governs-package
+    kind: governs
+    source: profile.agentic-memory-pilot
+    target: context_package.restart-package
+  - id: edge.policy-governs-package
+    kind: governs
+    source: policy.review-gate
+    target: context_package.restart-package
+  - id: edge.package-activates-plan
+    kind: activates
+    source: context_package.restart-package
+    target: plan.restart-context-package
+events:
+  - id: event.memory-questions-recorded
+    kind: recorded
+    timestamp: "2026-05-15T00:00:00Z"
+    actor: infospace-bench
+    task: IB-WP-0017-T01
+    node_updates:
+      - node_id: question.memory-decisions
+        operation: create
+      - node_id: decision.file-backed-pilot
+        operation: create
+  - id: event.workflow-restart-trace
+    kind: recorded
+    timestamp: "2026-05-15T00:05:00Z"
+    actor: infospace-bench
+    thread: agentic-memory-profile-pilot
+    task: IB-WP-0017-T05
+    node_updates:
+      - node_id: turn.review-handoff
+        operation: create
+      - node_id: tool_call.generator-status
+        operation: create
+      - node_id: observation.restart-risk
+        operation: create
+  - id: event.restart-package-activated
+    kind: activated
+    timestamp: "2026-05-15T00:10:00Z"
+    actor: infospace-bench
+    task: IB-WP-0017-T03
+    package_refs:
+      - memory:package:agentic-memory-profile-restart
+    activation_refs:
+      - activation:agentic-memory-profile-restart-review
+    metadata:
+      selected_node_count: 7
+      expected_item_count: 8
+  - id: event.review-gate-policy
+    kind: policy_decision
+    timestamp: "2026-05-15T00:15:00Z"
+    actor: infospace-bench
+    task: IB-WP-0017-T04
+    node_updates:
+      - node_id: policy.review-gate
+        operation: create
+    policy:
+      durable_writes: review_required
+      decision: allow_fixture_only
+metadata:
+  pilot_corpus: infospaces/wealth-vsm-legacy-slice
+  workplan: IB-WP-0017
+  lower_layer_contract: markitect.memory.graph.v1
diff --git a/infospaces/agentic-memory-profile-pilot/output/memory/memory-profile.yaml b/infospaces/agentic-memory-profile-pilot/output/memory/memory-profile.yaml
new file mode 100644
index 0000000..005c332
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/output/memory/memory-profile.yaml
@@ -0,0 +1,70 @@
+schema_version: markitect.memory.profile.v1
+id: infospace-agentic-memory-pilot
+title: Infospace Agentic Memory Pilot Profile
+intent: Compile selected reasoning decisions, workflow events, and artifact neighborhoods into restart context packages for infospace evaluation.
+memory_kinds:
+  - reasoning
+  - conversation
+  - knowledge
+  - package
+stores:
+  reasoning: markitect-memory-graph-fixture
+  conversation: infospace-workflow-trace-fixtures
+  knowledge: infospace-artifact-neighborhood
+  package: markitect-context-package
+limits:
+  reasoning:
+    max_nodes: 40
+  conversation:
+    max_nodes: 20
+  knowledge:
+    max_nodes: 80
+  package:
+    max_items: 8
+latency:
+  reasoning:
+    target_ms: 50
+  conversation:
+    target_ms: 50
+  knowledge:
+    target_ms: 120
+retention:
+  reasoning:
+    strategy: keep-reviewed-decisions
+    review_gate: required
+  conversation:
+    strategy: keep-accepted-handoff-events
+    window_events: 12
+  knowledge:
+    strategy: supersede-with-artifact-version
+  package:
+    strategy: regenerate-from-selection
+refresh:
+  cadence: manual
+  trigger: source-artifact-or-profile-digest-change
+compaction:
+  strategy: summarize-trace-after-review
+  owner: markitect-tool-contract-output
+activation:
+  max_items: 8
+  max_tokens: 1200
+  reserve_tokens: 150
+policy:
+  required_labels:
+    - project-local
+  durable_writes: review-gated
+  secrets_allowed: false
+observability:
+  emit_events: true
+  metrics:
+    - restart_quality_score
+    - provenance_coverage_ratio
+    - context_package_budget_max_tokens
+failure:
+  missing_runtime_store: degrade-to-file-backed-fixture
+  stale_profile: require-regeneration
+metadata:
+  workplan: IB-WP-0017
+  contract_owner: markitect-tool
+  runtime_owner: kontextual-engine
+  evaluation_owner: infospace-bench
diff --git a/infospaces/agentic-memory-profile-pilot/output/memory/restart-context-package.expected.yaml b/infospaces/agentic-memory-profile-pilot/output/memory/restart-context-package.expected.yaml
new file mode 100644
index 0000000..602829a
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/output/memory/restart-context-package.expected.yaml
@@ -0,0 +1,44 @@
+schema_version: infospace-bench.memory-context-expectation.v1
+package_id: memory:package:agentic-memory-profile-restart
+title: Agentic Memory Profile Restart Package
+intent: Activate the minimum useful context for resuming Wealth/VSM entity and relation review under the memory pilot boundary.
+profile_id: infospace-agentic-memory-pilot
+graph_id: infospace-agentic-memory-graph
+expected_item_count: 8
+expected_selected_nodes:
+  - decision.file-backed-pilot
+  - constraint.no-durable-runtime
+  - plan.restart-context-package
+  - artifact.wealth-source
+  - entity.division-of-labour
+  - finding.neighborhood-improves-review
+  - policy.review-gate
+expected_selected_edges:
+  - edge.constraint-supports-boundary
+  - edge.boundary-led-to-package-plan
+  - edge.division-derived-from-source
+  - edge.finding-references-division
+expected_selected_events:
+  - event.restart-package-activated
+budget:
+  max_items: 8
+  max_tokens: 1200
+  reserve_tokens: 150
+  strategy: first-fit
+acceptance:
+  requires_source_spans: true
+  max_token_estimate: 1200
+  required_node_kinds:
+    - decision
+    - constraint
+    - plan
+    - artifact
+    - entity
+    - finding
+    - policy
+  deterministic_fields:
+    - package_id
+    - title
+    - selected_nodes
+    - selected_edges
+    - selected_events
diff --git a/infospaces/agentic-memory-profile-pilot/output/memory/restart-context-selection.yaml b/infospaces/agentic-memory-profile-pilot/output/memory/restart-context-selection.yaml
new file mode 100644
index 0000000..cc88199
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/output/memory/restart-context-selection.yaml
@@ -0,0 +1,27 @@
+schema_version: markitect.memory.selection.v1
+graph: memory-graph.yaml
+profile: memory-profile.yaml
+title: Agentic Memory Profile Restart Package
+intent: Activate the minimum useful context for resuming Wealth/VSM entity and relation review under the memory pilot boundary.
+package_id: memory:package:agentic-memory-profile-restart
+namespace:
+  project: infospace-bench
+  task: IB-WP-0017
+node_ids:
+  - decision.file-backed-pilot
+  - constraint.no-durable-runtime
+  - plan.restart-context-package
+  - artifact.wealth-source
+  - entity.division-of-labour
+  - finding.neighborhood-improves-review
+  - policy.review-gate
+event_ids:
+  - event.restart-package-activated
+budget:
+  max_items: 8
+  max_tokens: 1200
+  reserve_tokens: 150
+  strategy: first-fit
+metadata:
+  purpose: restart-quality-evaluation
+  expected_task: resume Wealth/VSM review without rereading all workplans
diff --git a/infospaces/agentic-memory-profile-pilot/output/memory/traces/entity-review-restart.yaml b/infospaces/agentic-memory-profile-pilot/output/memory/traces/entity-review-restart.yaml
new file mode 100644
index 0000000..b62dd6b
--- /dev/null
+++ b/infospaces/agentic-memory-profile-pilot/output/memory/traces/entity-review-restart.yaml
@@ -0,0 +1,31 @@
+schema_version: infospace-bench.memory-trace.v1
+id: trace.entity-review-restart
+title: Entity Review Restart Trace
+workplan: IB-WP-0017
+source_workflow: wealth-vsm-legacy-slice-review
+events:
+  - id: trace-event.review-handoff
+    kind: conversation_turn
+    timestamp: "2026-05-15T00:05:00Z"
+    actor: reviewer
+    summary: A later agent needs to resume review of the Division of Labour relation without rereading every report.
+    memory_nodes:
+      - turn.review-handoff
+  - id: trace-event.status-check
+    kind: tool_call
+    timestamp: "2026-05-15T00:06:00Z"
+    actor: infospace-bench
+    summary: The slice is compact and coherent enough for package evaluation.
+    memory_nodes:
+      - tool_call.generator-status
+  - id: trace-event.restart-risk
+    kind: observation
+    timestamp: "2026-05-15T00:07:00Z"
+    actor: infospace-bench
+    summary: Decision-only context would miss the actual source/entity neighborhood under review.
+ memory_nodes: + - observation.restart-risk + - finding.neighborhood-improves-review +review_notes: + - The selected package should include source and entity nodes, not only planning decisions. + - Relation review improves when the active artifact neighborhood is visible. diff --git a/infospaces/agentic-memory-profile-pilot/output/memory/traces/generation-plan-decision.yaml b/infospaces/agentic-memory-profile-pilot/output/memory/traces/generation-plan-decision.yaml new file mode 100644 index 0000000..512cb77 --- /dev/null +++ b/infospaces/agentic-memory-profile-pilot/output/memory/traces/generation-plan-decision.yaml @@ -0,0 +1,25 @@ +schema_version: infospace-bench.memory-trace.v1 +id: trace.generation-plan-decision +title: Generation Plan Decision Trace +workplan: IB-WP-0017 +source_workflow: generic-source-generator +events: + - id: trace-event.select-corpus + kind: planning_decision + timestamp: "2026-05-15T00:00:00Z" + actor: infospace-bench + summary: Selected the Wealth/VSM legacy slice because it has source, entities, relation, evaluation, metrics, and engine sync evidence. + memory_nodes: + - question.memory-decisions + - decision.file-backed-pilot + - id: trace-event.boundary-check + kind: review_gate + timestamp: "2026-05-15T00:02:00Z" + actor: infospace-bench + summary: Confirmed the pilot should create fixtures and metrics, not reusable graph/event persistence. + memory_nodes: + - constraint.no-durable-runtime + - policy.review-gate +review_notes: + - Keep this as a trace fixture until a runtime store exists behind kontextual-engine. + - Convert only reviewed decisions and useful restart evidence into memory graph nodes. 
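Reviewer note: both trace fixtures attach `memory_nodes` to each event, and the restart selection lists `node_ids`. The following is a minimal sketch, assuming the fixtures have already been parsed into plain dicts (for example with `yaml.safe_load`), of how one might list trace-referenced nodes that the restart selection does not cover. The helper names `trace_node_ids` and `uncovered_nodes` are hypothetical and are not part of Markitect or `infospace-bench`; the inline data is copied from the fixtures in this patch and reduced to the fields used here.

```python
from typing import Iterable


def trace_node_ids(trace: dict) -> list[str]:
    """Collect memory node ids referenced by a trace's events, in first-seen order."""
    seen: list[str] = []
    for event in trace.get("events", []):
        for node_id in event.get("memory_nodes", []):
            if node_id not in seen:
                seen.append(node_id)
    return seen


def uncovered_nodes(trace: dict, selected: Iterable[str]) -> list[str]:
    """Return trace-referenced node ids that are absent from the selection."""
    selected_set = set(selected)
    return [node_id for node_id in trace_node_ids(trace) if node_id not in selected_set]


# Reduced inline copy of traces/generation-plan-decision.yaml (hypothetical in-memory form).
generation_plan_trace = {
    "id": "trace.generation-plan-decision",
    "events": [
        {
            "id": "trace-event.select-corpus",
            "memory_nodes": ["question.memory-decisions", "decision.file-backed-pilot"],
        },
        {
            "id": "trace-event.boundary-check",
            "memory_nodes": ["constraint.no-durable-runtime", "policy.review-gate"],
        },
    ],
}

# node_ids as listed in restart-context-selection.yaml.
restart_node_ids = [
    "decision.file-backed-pilot",
    "constraint.no-durable-runtime",
    "plan.restart-context-package",
    "artifact.wealth-source",
    "entity.division-of-labour",
    "finding.neighborhood-improves-review",
    "policy.review-gate",
]

print(uncovered_nodes(generation_plan_trace, restart_node_ids))
# -> ['question.memory-decisions']
```

Whether an uncovered node is a gap or an intentional omission remains a review decision; the sketch only makes the difference visible.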
diff --git a/infospaces/agentic-memory-profile-pilot/output/metrics/memory-profile-history.yaml b/infospaces/agentic-memory-profile-pilot/output/metrics/memory-profile-history.yaml new file mode 100644 index 0000000..23cd049 --- /dev/null +++ b/infospaces/agentic-memory-profile-pilot/output/metrics/memory-profile-history.yaml @@ -0,0 +1,17 @@ +history: + - snapshot_id: memory-profile-pilot-20260515 + recorded_at: "2026-05-15T00:25:00Z" + workplan: IB-WP-0017 + metrics: + memory_profile_contract_valid: true + memory_graph_contract_valid: true + selection_contract_valid: true + restart_quality_score: 4.2 + provenance_coverage_ratio: 1.0 + selected_node_count: 7 + expected_item_count: 8 + selected_edge_count: 4 + context_package_budget_max_tokens: 1200 + notes: + - First deterministic fixture snapshot for the agentic memory profile pilot. + - Context package token estimate is checked in tests because Markitect computes it during compilation. diff --git a/infospaces/agentic-memory-profile-pilot/output/metrics/metrics.yaml b/infospaces/agentic-memory-profile-pilot/output/metrics/metrics.yaml new file mode 100644 index 0000000..14e61a8 --- /dev/null +++ b/infospaces/agentic-memory-profile-pilot/output/metrics/metrics.yaml @@ -0,0 +1,11 @@ +memory_profile_contract_valid: true +memory_graph_contract_valid: true +selection_contract_valid: true +restart_quality_score: 4.2 +provenance_coverage_ratio: 1.0 +selected_node_count: 7 +expected_item_count: 8 +selected_edge_count: 4 +context_package_budget_max_tokens: 1200 +live_llm_required: false +durable_runtime_required: false diff --git a/infospaces/agentic-memory-profile-pilot/reports/memory-profile-pilot.md b/infospaces/agentic-memory-profile-pilot/reports/memory-profile-pilot.md new file mode 100644 index 0000000..f396130 --- /dev/null +++ b/infospaces/agentic-memory-profile-pilot/reports/memory-profile-pilot.md @@ -0,0 +1,34 @@ +# Agentic Memory Profile Pilot Report + +## Decision + +The pilot validates agentic memory 
profiles through file-backed evidence in +`infospace-bench`. The durable runtime remains out of scope. Markitect owns the +memory graph/profile/selection contracts and context package compiler, while +`kontextual-engine` owns future runtime state, audit, permissions, retention, +refresh, compaction, and durable write review gates. + +## Fixture Set + +- `output/memory/memory-profile.yaml`: Markitect-compatible memory profile. +- `output/memory/memory-graph.yaml`: reviewed decision, trace, and artifact + neighborhood graph. +- `output/memory/restart-context-selection.yaml`: graph selection for a restart + package. +- `output/memory/restart-context-package.expected.yaml`: deterministic + expectations for compiled package shape. +- `output/memory/context-package-evaluation.yaml`: package usefulness evidence + and contract feedback. +- `output/memory/traces/*.yaml`: applied workflow traces showing where memory + records arise. + +## Result + +The selected restart package combines three kinds of useful context: + +- the boundary decision that keeps this repo as the evaluation layer +- the review gate that prevents hidden durable memory writes +- the Wealth/VSM source and entity neighborhood needed to resume artifact review + +This is enough to answer the Workplan 17 pilot questions without turning the +infospace into a memory runtime. 
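Reviewer note: the fixture set pairs a graph selection with deterministic expectations for the compiled package. The following is a minimal sketch of a pre-compile consistency check between the two files, assuming both have been parsed into dicts. The function name `check_selection_against_expected` is hypothetical. The rule that each selected node and event yields exactly one package item is an assumption that matches the fixture numbers (7 nodes + 1 event = 8 items) but is not confirmed against Markitect's compiler.

```python
def check_selection_against_expected(selection: dict, expected: dict) -> list[str]:
    """Return human-readable mismatches between a selection and its expectations."""
    problems: list[str] = []
    if selection["package_id"] != expected["package_id"]:
        problems.append("package_id differs")
    if selection["node_ids"] != expected["expected_selected_nodes"]:
        problems.append("node_ids differ from expected_selected_nodes")
    if selection["event_ids"] != expected["expected_selected_events"]:
        problems.append("event_ids differ from expected_selected_events")
    if selection["budget"] != expected["budget"]:
        problems.append("budget differs")
    # ASSUMPTION: one package item per selected node and per selected event.
    item_count = len(selection["node_ids"]) + len(selection["event_ids"])
    if item_count != expected["expected_item_count"]:
        problems.append(f"item count {item_count} != {expected['expected_item_count']}")
    if item_count > selection["budget"]["max_items"]:
        problems.append("item count exceeds budget.max_items")
    return problems


# Values copied from the fixtures in this patch, reduced to the fields checked above.
node_ids = [
    "decision.file-backed-pilot",
    "constraint.no-durable-runtime",
    "plan.restart-context-package",
    "artifact.wealth-source",
    "entity.division-of-labour",
    "finding.neighborhood-improves-review",
    "policy.review-gate",
]
budget = {"max_items": 8, "max_tokens": 1200, "reserve_tokens": 150, "strategy": "first-fit"}

selection = {
    "package_id": "memory:package:agentic-memory-profile-restart",
    "node_ids": node_ids,
    "event_ids": ["event.restart-package-activated"],
    "budget": budget,
}
expected = {
    "package_id": "memory:package:agentic-memory-profile-restart",
    "expected_item_count": 8,
    "expected_selected_nodes": list(node_ids),
    "expected_selected_events": ["event.restart-package-activated"],
    "budget": dict(budget),
}

print(check_selection_against_expected(selection, expected))
# -> []
```

A check like this would only catch drift between the two YAML files. The token estimate and the selected edges still need the compiled package, as the test module in this patch does.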
diff --git a/tests/test_agentic_memory_profile.py b/tests/test_agentic_memory_profile.py new file mode 100644 index 0000000..ef9ba0d --- /dev/null +++ b/tests/test_agentic_memory_profile.py @@ -0,0 +1,92 @@ +from pathlib import Path + +import yaml + +from infospace_bench import load_infospace +from markitect_tool.memory.graph import ( + compile_memory_graph_selection_to_context_package, + load_memory_graph_file, + load_memory_graph_selection_file, + load_memory_profile_file, + validate_memory_graph, + validate_memory_profile, +) + + +ROOT = Path("infospaces/agentic-memory-profile-pilot") + + +def test_agentic_memory_profile_pilot_is_loadable() -> None: + infospace = load_infospace(ROOT) + artifact_ids = {artifact.id for artifact in infospace.artifacts} + + assert infospace.config.slug == "agentic-memory-profile-pilot" + assert "source/memory-pilot-brief.md" in artifact_ids + assert "generated/memory-profile.yaml" in artifact_ids + assert "generated/memory-graph.yaml" in artifact_ids + assert "generated/restart-context-selection.yaml" in artifact_ids + assert "generated/context-package-evaluation.yaml" in artifact_ids + + +def test_memory_profile_and_graph_validate_with_markitect_contracts() -> None: + profile = load_memory_profile_file(ROOT / "output" / "memory" / "memory-profile.yaml") + graph = load_memory_graph_file(ROOT / "output" / "memory" / "memory-graph.yaml") + + profile_result = validate_memory_profile(profile) + graph_result = validate_memory_graph(graph) + + assert profile_result.valid, [item.to_dict() for item in profile_result.diagnostics] + assert graph_result.valid, [item.to_dict() for item in graph_result.diagnostics] + assert profile_result.metadata["memory_kinds"] == [ + "reasoning", + "conversation", + "knowledge", + "package", + ] + assert graph_result.metadata == {"nodes": 15, "edges": 14, "events": 4} + + +def test_restart_context_selection_compiles_to_expected_package_shape() -> None: + profile = load_memory_profile_file(ROOT / "output" / 
"memory" / "memory-profile.yaml") + graph = load_memory_graph_file(ROOT / "output" / "memory" / "memory-graph.yaml") + selection = load_memory_graph_selection_file( + ROOT / "output" / "memory" / "restart-context-selection.yaml" + ) + expected = yaml.safe_load( + (ROOT / "output" / "memory" / "restart-context-package.expected.yaml").read_text( + encoding="utf-8" + ) + ) + + package = compile_memory_graph_selection_to_context_package( + graph, + selection, + profile, + ) + memory_graph = package.metadata["memory_graph"] + + assert package.id == expected["package_id"] + assert package.title == expected["title"] + assert len(package.items) == expected["expected_item_count"] + assert package.budget.to_dict() == expected["budget"] + assert package.token_estimate <= expected["acceptance"]["max_token_estimate"] + assert memory_graph["selected_nodes"] == expected["expected_selected_nodes"] + assert memory_graph["selected_edges"] == expected["expected_selected_edges"] + assert memory_graph["selected_events"] == expected["expected_selected_events"] + assert all(item.source.path for item in package.items) + + +def test_agentic_memory_pilot_docs_route_lower_layer_feedback() -> None: + text = Path("docs/agentic-memory-profile-pilot.md").read_text(encoding="utf-8") + evaluation = yaml.safe_load( + (ROOT / "output" / "memory" / "context-package-evaluation.yaml").read_text( + encoding="utf-8" + ) + ) + + assert "markitect.memory.profile.v1" in text + assert "kontextual-engine" in text + assert "Do not store credentials" in text + assert evaluation["recommended_contract_changes"]["markitect-tool"] + assert evaluation["recommended_contract_changes"]["kontextual-engine"] + assert evaluation["metrics"]["live_llm_required"] is False diff --git a/workplans/IB-WP-0017-agentic-memory-profile-pilot.md b/workplans/IB-WP-0017-agentic-memory-profile-pilot.md index 2985b74..2bbfb70 100644 --- a/workplans/IB-WP-0017-agentic-memory-profile-pilot.md +++ 
b/workplans/IB-WP-0017-agentic-memory-profile-pilot.md @@ -4,7 +4,7 @@ type: workplan title: "Agentic Memory Profile Infospace Pilot" domain: markitect repo: infospace-bench -status: todo +status: completed owner: markitect topic_slug: markitect created: "2026-05-15" @@ -57,7 +57,7 @@ infrastructure. ```task id: IB-WP-0017-T01 -status: todo +status: done priority: high state_hub_task_id: "a84301cc-b6b8-4f16-8b21-8d5510160ab8" ``` @@ -79,7 +79,7 @@ Output: pilot selection note and memory-question matrix. ```task id: IB-WP-0017-T02 -status: todo +status: done priority: high state_hub_task_id: "105f6555-243c-4374-8010-b2a61f6df83e" ``` @@ -101,7 +101,7 @@ Output: checked-in fixture set and validation docs. ```task id: IB-WP-0017-T03 -status: todo +status: done priority: high state_hub_task_id: "243d478c-b17e-4cd8-9562-edf3072eaf9c" ``` @@ -122,7 +122,7 @@ Output: evaluation report, metrics history, and recommended contract changes. ```task id: IB-WP-0017-T04 -status: todo +status: done priority: medium state_hub_task_id: "db8ebf8b-4507-48de-a168-6eb82e584687" ``` @@ -144,7 +144,7 @@ Output: integration plan aligned with `IB-WP-0010` and `KONT-WP-0017`. ```task id: IB-WP-0017-T05 -status: todo +status: done priority: medium state_hub_task_id: "4f8dccbc-329f-484e-97ad-1d6d049d3001" ``` @@ -166,7 +166,7 @@ Output: trace examples and review notes. ```task id: IB-WP-0017-T06 -status: todo +status: done priority: medium state_hub_task_id: "c4b08c44-9c80-4b58-a050-1362996bae4d" ``` @@ -184,6 +184,26 @@ The feedback should identify: Output: architecture feedback note and proposed follow-on workplans where needed. +## Implementation Evidence + +- Pilot corpus and question matrix: + `docs/agentic-memory-profile-pilot.md` and + `infospaces/agentic-memory-profile-pilot/artifacts/sources/memory-pilot-brief.md`. 
+- Markitect-compatible fixtures: + `infospaces/agentic-memory-profile-pilot/output/memory/memory-profile.yaml`, + `memory-graph.yaml`, and `restart-context-selection.yaml`. +- Expected context package shape: + `infospaces/agentic-memory-profile-pilot/output/memory/restart-context-package.expected.yaml`. +- Context package evaluation and metrics history: + `infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml`, + `output/metrics/metrics.yaml`, and `output/metrics/memory-profile-history.yaml`. +- Applied workflow traces: + `infospaces/agentic-memory-profile-pilot/output/memory/traces/`. +- Runtime integration plan and lower-layer feedback: + `docs/agentic-memory-profile-pilot.md`. +- Deterministic acceptance coverage: + `tests/test_agentic_memory_profile.py`. + ## Acceptance - The pilot validates memory profiles against a concrete infospace workflow.