generated from coulomb/repo-seed
105 lines
5.2 KiB
Markdown
105 lines
5.2 KiB
Markdown
# Agent Working Memory Thought Experiment
|
|
|
|
Date: 2026-05-04
|
|
|
|
## Prompt
|
|
|
|
Imagine an agent memory architecture optimized for short and mid term
|
|
efficiency across different reaction speeds, timescales, and cost constraints,
|
|
while preserving a personality-like long term identity and learning memory that
|
|
evolves over time.
|
|
|
|
This is a wishful design note. It is not a promise that every layer belongs in
|
|
`markitect-tool` core.
|
|
|
|
## Memory Layers
|
|
|
|
| Layer | Reaction Speed | Lifetime | Cost Shape | Useful Contents |
|
|
| --- | --- | --- | --- | --- |
|
|
| Reflex context | milliseconds to seconds | one response | highest compute pressure | current user turn, active file, tool output, immediate errors |
|
|
| Working set | seconds to minutes | one task step | medium compute pressure | current plan, nearby code/docs, selected snippets, constraints |
|
|
| Episode memory | minutes to hours | one chat/thread/work session | moderate storage | decisions, explored dead ends, task state, user preferences for this effort |
|
|
| Project semantic memory | hours to weeks | project lifecycle | cheap local storage, occasional refresh | architecture docs, stable APIs, workplans, policies, source maps, examples |
|
|
| Identity and learning memory | weeks to years | agent/persona/project relationship | expensive to write, cheap to retrieve | values, style, durable preferences, repeated collaboration patterns |
|
|
| Policy and safety memory | all speeds | explicit freshness windows | must be rechecked before use | labels, trust zones, subject identity, decision ids, authorization context |
|
|
|
|
The main design pressure is not just retrieval quality. It is deciding when a
|
|
memory should be cheap, fast, provisional, discardable, re-checkable, or
|
|
durable.
|
|
|
|
## Wish List
|
|
|
|
If a few things could be manifested instantly:
|
|
|
|
1. A hot context router would keep the current working set small and sharp. It
|
|
would load exact spans and summaries only when they are needed, then release
|
|
them when the task moves on.
|
|
2. Every context item would carry provenance, policy metadata, freshness,
|
|
token cost, and a reason it was included.
|
|
3. Memory would be layered, not monolithic. Fast scratch memory, package memory,
|
|
project memory, and identity memory would have different write thresholds.
|
|
4. Durable memory writes would be explicit and inspectable. The agent could
|
|
propose "this seems worth remembering" rather than silently mutating itself.
|
|
5. Long term identity would be represented as small, versioned principles and
|
|
preferences, not as a bag of transcripts.
|
|
6. The system would support both activation and forgetting. Dropping context
|
|
should be as first-class as loading it.
|
|
7. Retrieval would be policy-aware at both package creation and activation.
|
|
A package created yesterday should not be blindly activated today.
|
|
8. Summaries would be multi-resolution: one-line, section-level, package-level,
|
|
project-level, and identity-level, each with its own freshness and source
|
|
links.
|
|
9. Local deterministic memory would work offline. LLM-assisted summaries,
|
|
embeddings, external stores, and enterprise authorization would be optional
|
|
adapters.
|
|
10. The agent could maintain a small "self continuity" layer: tone,
|
|
collaboration norms, current mission, and durable lessons, all with user
|
|
review and rollback.
|
|
|
|
## Dynamic Balances
|
|
|
|
The useful architecture has several moving balances:
|
|
|
|
- Fast recall vs. exact provenance: the reflex layer can be approximate, but
|
|
activation packages must be explainable.
|
|
- Rich context vs. token budget: every item needs a token estimate and a
|
|
summary so a caller can choose the right resolution.
|
|
- Stability vs. freshness: project architecture memory is useful only if it
|
|
can be refreshed against current snapshots.
|
|
- Personalization vs. consent: long term identity memory should require higher
|
|
write friction than task-local memory.
|
|
- Local autonomy vs. enterprise control: local package creation must work
|
|
offline, while activation can re-check policy when a gateway is available.
|
|
- Retrieval breadth vs. actionability: packages should be small enough to
|
|
activate directly and large enough to explain why they matter.
|
|
|
|
## Implications For Markitect
|
|
|
|
Markitect should own the local, inspectable package layer:
|
|
|
|
- schema for context packages
|
|
- package creation from selectors, search, and manifests
|
|
- deterministic summary layers
|
|
- activation/deactivation/refresh/explain lifecycle
|
|
- source spans, token estimates, provenance, and policy metadata
|
|
- optional adapter boundaries for assisted summaries, embeddings, vector
|
|
stores, and external policy decision points
|
|
|
|
Markitect should not own hidden agent identity memory as an ambient service.
|
|
It can provide explicit package formats and namespaces for identity-like
|
|
memory, but durable writes should remain visible, reviewable, and policy-bound.
|
|
|
|
## Best First Step
|
|
|
|
The first implementation should be boring in the right way:
|
|
|
|
- local files and local SQLite index only
|
|
- deterministic summaries only
|
|
- explicit package files in `.markitect/context`
|
|
- policy metadata preserved and optionally rechecked
|
|
- no live flex-auth, vector database, embedding provider, LLM provider, or
|
|
hidden agent state
|
|
|
|
That gives future systems a trustworthy memory substrate without making the
|
|
core repo pretend to be a full cognitive architecture.
|