infospace-bench/workplans/IB-WP-0017-agentic-memory-profile-pilot.md
2026-05-15 16:01:35 +02:00

id: IB-WP-0017
type: workplan
title: Agentic Memory Profile Infospace Pilot
domain: markitect
repo: infospace-bench
status: completed
owner: markitect
topic_slug: markitect
created: 2026-05-15
updated: 2026-05-15
depends_on_workplans, related_workplans: IB-WP-0010, IB-WP-0014, MKTT-WP-0016, KONT-WP-0017
state_hub_workstream_id: b549b9fe-0ebe-4fa9-9d60-c317e44032f5

IB-WP-0017: Agentic Memory Profile Infospace Pilot

Goal

Validate the agentic memory graph/profile architecture in a concrete infospace without turning infospace-bench into a memory runtime.

markitect-tool should own graph/profile contracts and context-package compilation. kontextual-engine should own durable memory state and permission-aware runtime behavior. infospace-bench should own the applied pilot: corpus selection, project configuration, evaluation metrics, workflow traces, and practical feedback to lower layers.

Intent

Use a real infospace workflow to answer practical questions:

  • Which reasoning decisions should become durable memory?
  • Which conversation or workflow events are useful later?
  • Which knowledge graph neighborhoods improve infospace evaluation and review?
  • Which context package shapes are helpful for agents?
  • Which profile parameters are too abstract, missing, or misplaced?

This workplan should produce evidence and fixtures, not reusable runtime infrastructure.

Non-Goals

  • Do not implement graph/event persistence in infospace-bench.
  • Do not redefine Markitect memory graph/profile schemas.
  • Do not require live LLM calls in the default test suite.
  • Do not store secrets, provider credentials, or durable user memory inside an infospace.
  • Do not make an infospace backend act as a general memory service.

T01 - Select pilot corpus and memory questions

id: IB-WP-0017-T01
status: done
priority: high
state_hub_task_id: "a84301cc-b6b8-4f16-8b21-8d5510160ab8"

Choose a bounded pilot, likely one of the existing ebook or reference infospaces, and define the memory questions it should exercise.

The pilot should cover:

  • reasoning decisions during generation or review
  • conversation/workflow events worth replaying
  • stable knowledge facts and relationships
  • activation needs for later agent work
  • evaluation questions and acceptance thresholds

Output: pilot selection note and memory-question matrix.

T02 - Create Markitect-compatible memory fixtures

id: IB-WP-0017-T02
status: done
priority: high
state_hub_task_id: "105f6555-243c-4374-8010-b2a61f6df83e"

Create fixture data that uses the markitect-tool memory graph/profile contracts.

Fixtures should include:

  • decision graph snippets from infospace planning or review
  • conversation/workflow event paths
  • knowledge graph neighborhoods over real infospace artifacts
  • memory profile examples with budgets, retention intent, and policy metadata
  • expected context package outputs

Output: checked-in fixture set and validation docs.
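
As an illustration of the kind of check a fixture with budgets, retention intent, and policy metadata could go through before it is committed, here is a minimal sketch. Every field name (`token_budget`, `retention_intent`, `policy`) and the allowed retention labels are hypothetical; the markitect-tool contracts remain the source of truth for the real schema.

```python
from dataclasses import dataclass, field

# Hypothetical retention labels; the real contract may define different values.
ALLOWED_RETENTION = {"session", "project", "durable-candidate"}


@dataclass
class MemoryProfileFixture:
    """Illustrative stand-in for one memory profile fixture entry."""
    name: str
    token_budget: int
    retention_intent: str
    policy: dict = field(default_factory=dict)


def validate_profile(profile: MemoryProfileFixture) -> list[str]:
    """Return human-readable problems; an empty list means the fixture passes."""
    problems = []
    if profile.token_budget <= 0:
        problems.append("token_budget must be positive")
    if profile.retention_intent not in ALLOWED_RETENTION:
        problems.append(f"unknown retention_intent: {profile.retention_intent}")
    # The non-goals forbid secrets inside an infospace, so flag suspicious keys.
    for key in profile.policy:
        if "secret" in key.lower() or "credential" in key.lower():
            problems.append(f"policy key looks like a secret: {key}")
    return problems
```

A check of this shape stays file-backed and credential-free, which fits the non-goals above.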

T03 - Evaluate context package usefulness

id: IB-WP-0017-T03
status: done
priority: high
state_hub_task_id: "243d478c-b17e-4cd8-9562-edf3072eaf9c"

Use Markitect-generated context packages in concrete infospace tasks.

Evaluate:

  • whether activated context improves task restart quality
  • whether the recorded provenance is sufficient for review
  • whether token budgets are realistic
  • whether decision paths and knowledge neighborhoods are understandable
  • where packages become noisy or incomplete

Output: evaluation report, metrics history, and recommended contract changes.
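
Two of the criteria above, token budgets and provenance, lend themselves to deterministic signals. The sketch below assumes a hypothetical package shape (a list of items with a `text` string and an optional `source` string) and uses a whitespace word count as a crude stand-in for a real tokenizer; neither is taken from the Markitect contracts.

```python
def evaluate_package(items: list[dict], token_budget: int) -> dict:
    """Compute simple, deterministic usefulness signals for a context package.

    Each item is assumed to be a dict with a "text" string and an optional
    "source" string naming its provenance (hypothetical shape).
    """
    # Whitespace word count is only an approximation of real token usage.
    tokens = sum(len(item["text"].split()) for item in items)
    with_source = sum(1 for item in items if item.get("source"))
    coverage = with_source / len(items) if items else 0.0
    return {
        "estimated_tokens": tokens,
        "within_budget": tokens <= token_budget,
        "provenance_coverage": coverage,
    }
```

Questions such as whether a decision path is understandable still need human review; signals like these only narrow where to look.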

T04 - Plan optional engine-backed runtime integration

id: IB-WP-0017-T04
status: done
priority: medium
state_hub_task_id: "db8ebf8b-4507-48de-a168-6eb82e584687"

Design an integration scenario where kontextual-engine provides durable memory graph/event state behind an infospace workflow.

The plan should define:

  • what remains file-backed in the pilot
  • what would be engine-backed later
  • sync and dry-run behavior
  • provenance and audit expectations
  • review gates for durable memory writes

Output: integration plan aligned with IB-WP-0010 and KONT-WP-0017.
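
One way to picture the dry-run behavior and review gates is a sync step that reports what it would write and only calls the engine when explicitly told to. This is a sketch under stated assumptions: `PendingWrite`, `sync_memory`, and the `write` callback are hypothetical names, and nothing here reflects an actual kontextual-engine API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PendingWrite:
    """A memory record proposed for durable storage (hypothetical shape)."""
    record_id: str
    approved: bool  # outcome of the review gate


def sync_memory(
    pending: list[PendingWrite],
    write: Callable[[str], None],
    dry_run: bool = True,
) -> dict:
    """Report approved and blocked writes; call `write` only when not a dry run."""
    approved = [w.record_id for w in pending if w.approved]
    blocked = [w.record_id for w in pending if not w.approved]
    written: list[str] = []
    if not dry_run:
        for record_id in approved:
            write(record_id)  # stand-in for an engine-backed durable write
            written.append(record_id)
    return {
        "dry_run": dry_run,
        "approved": approved,
        "blocked": blocked,
        "written": written,
    }
```

Defaulting `dry_run` to `True` keeps the pilot file-backed unless a caller opts in, and unapproved records never reach the writer in either mode.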

T05 - Add applied memory workflow traces

id: IB-WP-0017-T05
status: done
priority: medium
state_hub_task_id: "4f8dccbc-329f-484e-97ad-1d6d049d3001"

Capture applied workflow traces that show how memory records arise from real infospace work.

Examples:

  • generation plan decisions
  • entity/relation review decisions
  • evaluation failures and fixes
  • source chunk triage
  • agent handoff or restart context

Output: trace examples and review notes.
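
A trace entry for one of these examples could be captured as a small record and serialized in a stable order so that checked-in traces diff cleanly. The field names below (`sequence`, `kind`, `decision`, `rationale`, `artifacts`) are hypothetical and are not the format of the traces under output/memory/traces/.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TraceEvent:
    """One step in an applied workflow trace (illustrative fields only)."""
    sequence: int
    kind: str  # e.g. "generation-plan", "entity-review", "eval-fix"
    decision: str
    rationale: str
    artifacts: list[str] = field(default_factory=list)


def dump_trace(events: list[TraceEvent]) -> str:
    """Serialize events ordered by sequence, with sorted keys for stable diffs."""
    ordered = sorted(events, key=lambda event: event.sequence)
    return json.dumps([asdict(event) for event in ordered], indent=2, sort_keys=True)
```

No wall-clock timestamps are generated inside the sketch, so the same events always produce the same output.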

T06 - Feed architecture findings back to lower layers

id: IB-WP-0017-T06
status: done
priority: medium
state_hub_task_id: "c4b08c44-9c80-4b58-a050-1362996bae4d"

Summarize what the pilot proves or falsifies.

The feedback should identify:

  • Markitect contract gaps
  • engine runtime needs
  • profile parameters that are too vague or too detailed
  • useful default package shapes
  • application-specific behavior that should stay in infospace-bench

Output: architecture feedback note and proposed follow-on workplans where needed.
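
The routing of findings can be sketched as a lookup from finding category to owning repository, following the ownership split stated in the Goal section. The category labels are hypothetical, and treating profile-parameter and package-shape findings as markitect-tool feedback is an assumption based on that split.

```python
# Hypothetical category labels mapped to the repositories named in this workplan.
OWNER_BY_CATEGORY = {
    "contract-gap": "markitect-tool",
    "profile-parameter": "markitect-tool",
    "package-shape": "markitect-tool",
    "runtime-need": "kontextual-engine",
    "application-behavior": "infospace-bench",
}


def route_findings(findings: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (category, note) findings by owning repository.

    Unknown categories land under "unrouted" so they are reviewed, not dropped.
    """
    routed: dict[str, list[str]] = {}
    for category, note in findings:
        owner = OWNER_BY_CATEGORY.get(category, "unrouted")
        routed.setdefault(owner, []).append(note)
    return routed
```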

Implementation Evidence

  • Pilot corpus and question matrix: docs/agentic-memory-profile-pilot.md and infospaces/agentic-memory-profile-pilot/artifacts/sources/memory-pilot-brief.md.
  • Markitect-compatible fixtures: infospaces/agentic-memory-profile-pilot/output/memory/memory-profile.yaml, memory-graph.yaml, and restart-context-selection.yaml.
  • Expected context package shape: infospaces/agentic-memory-profile-pilot/output/memory/restart-context-package.expected.yaml.
  • Context package evaluation and metrics history: infospaces/agentic-memory-profile-pilot/output/memory/context-package-evaluation.yaml, output/metrics/metrics.yaml, and output/metrics/memory-profile-history.yaml.
  • Applied workflow traces: infospaces/agentic-memory-profile-pilot/output/memory/traces/.
  • Runtime integration plan and lower-layer feedback: docs/agentic-memory-profile-pilot.md.
  • Deterministic acceptance coverage: tests/test_agentic_memory_profile.py.

Acceptance

  • The pilot validates memory profiles against a concrete infospace workflow.
  • All reusable contract feedback is routed to markitect-tool.
  • Durable runtime needs are routed to kontextual-engine.
  • infospace-bench remains the application/evaluation layer.
  • Default tests and fixtures stay deterministic and credential-free.