Memory Management

anchor provides a layered memory system that tracks conversation history, stores persistent facts, and integrates both into the context pipeline. This guide covers every component -- from the coordinator down to decay functions and garbage collection.

Architecture Overview

The memory system has three tiers:

Conversation memory -- a token-budgeted window of recent turns (SlidingWindowMemory or SummaryBufferMemory).
Persistent facts -- long-term entries in a MemoryEntryStore, managed through MemoryManager.
Graph memory -- an optional entity-relationship graph (SimpleGraphMemory) for structured knowledge.

MemoryManager sits on top, producing ContextItem objects for the pipeline.


MemoryManager
  |-- conversation: SlidingWindowMemory | SummaryBufferMemory
  |-- persistent_store: MemoryEntryStore (optional)
  `-- get_context_items() --> ContextItem[]

MemoryManager -- The Coordinator

MemoryManager is the single entry point for adding messages and retrieving context.

from anchor import MemoryManager

manager = MemoryManager(conversation_tokens=4096)
manager.add_user_message("What is context engineering?")
manager.add_assistant_message("It is the practice of assembling...")

items = manager.get_context_items()
print(len(items))  # 2 context items from conversation history

Pass a custom conversation backend to switch strategies:

from anchor import MemoryManager, SummaryBufferMemory

def compact(turns):
    return "; ".join(t.content[:40] for t in turns)

summary_mem = SummaryBufferMemory(max_tokens=2048, compact_fn=compact)
manager = MemoryManager(conversation_memory=summary_mem)

[!TIP] When conversation_memory is provided, the conversation_tokens, tokenizer, and on_evict parameters are ignored.

SlidingWindowMemory

A rolling window of conversation turns within a token budget. When a new turn would exceed the limit, oldest turns are evicted first.

from anchor import SlidingWindowMemory

window = SlidingWindowMemory(max_tokens=1024)
window.add_turn("user", "Hello!")
window.add_turn("assistant", "Hi there!")
print(window.total_tokens)  # current token usage

Eviction Callback

React to evicted turns (log them, archive them):

def on_evict(evicted_turns):
    for t in evicted_turns:
        print(f"Evicted: {t.role}: {t.content[:50]}")

window = SlidingWindowMemory(max_tokens=512, on_evict=on_evict)

Custom Eviction Policy and Recency Scorer

from anchor import (
    SlidingWindowMemory, ImportanceEviction, ExponentialRecencyScorer,
)

window = SlidingWindowMemory(
    max_tokens=1024,
    eviction_policy=ImportanceEviction(
        importance_fn=lambda turn: len(turn.content) / 100.0
    ),
    recency_scorer=ExponentialRecencyScorer(decay_rate=3.0),
)

SummaryBufferMemory

A two-tier memory: recent turns verbatim plus a running summary of evicted content. Exactly one of compact_fn or progressive_compact_fn must be provided.

Simple Compaction

from anchor import SummaryBufferMemory

def compact(turns):
    return "Summary: " + "; ".join(t.content[:60] for t in turns)

mem = SummaryBufferMemory(max_tokens=1024, compact_fn=compact)
mem.add_message("user", "Tell me about Python.")
mem.add_message("assistant", "Python is a high-level language...")
print(mem.summary)  # None until eviction occurs

Progressive Compaction

progressive_compact_fn also receives the previous summary, enabling incremental refinement:

from anchor import SummaryBufferMemory

def progressive(turns, previous_summary):
    new_content = "; ".join(t.content[:40] for t in turns)
    if previous_summary:
        return f"{previous_summary} | {new_content}"
    return new_content

mem = SummaryBufferMemory(max_tokens=512, progressive_compact_fn=progressive)

[!NOTE] If the compaction function raises an exception, the raw turn content is used as a fallback so evicted data is never lost.

SimpleGraphMemory

An in-memory directed graph for entity-relationship tracking without an external graph database.

from anchor import SimpleGraphMemory

graph = SimpleGraphMemory()
graph.add_entity("alice", {"type": "person", "role": "engineer"})
graph.add_entity("project-x", {"type": "project"})
graph.add_relationship("alice", "works_on", "project-x")

graph.add_entity("bob", {"type": "person"})
graph.add_relationship("bob", "works_on", "project-x")
graph.link_memory("alice", "mem-001")
graph.link_memory("project-x", "mem-002")

# BFS traversal: find related entities within 2 hops
related = graph.get_related_entities("alice", max_depth=2)
print(related)  # ['project-x', 'bob']

# Collect memory IDs for entity and its neighborhood
memory_ids = graph.get_related_memory_ids("alice", max_depth=2)
print(memory_ids)  # ['mem-001', 'mem-002']

[!TIP] add_relationship auto-creates nodes for source and target if they do not already exist. link_memory requires the entity to exist.

Eviction Policies

Three built-in policies implement the EvictionPolicy protocol.

FIFOEviction -- evicts oldest turns first (matches the default):

from anchor import FIFOEviction, SlidingWindowMemory

window = SlidingWindowMemory(max_tokens=1024, eviction_policy=FIFOEviction())

ImportanceEviction -- evicts lowest-scoring turns first:

from anchor import ImportanceEviction

policy = ImportanceEviction(importance_fn=lambda turn: len(turn.content) / 100.0)

PairedEviction -- evicts user+assistant pairs together so no orphaned questions remain:

from anchor import PairedEviction, SlidingWindowMemory

window = SlidingWindowMemory(max_tokens=1024, eviction_policy=PairedEviction())

Decay Functions and Recency Scorers

Decay functions compute retention scores for persistent memory entries. The garbage collector uses them to prune forgotten memories.

EbbinghausDecay -- forgetting curve R = e^(-t/S) where strength grows with access count:

from anchor import EbbinghausDecay
decay = EbbinghausDecay(base_strength=1.0, reinforcement_factor=0.5)

LinearDecay -- linear interpolation from 1.0 to 0.0. At half_life_hours the retention is 0.5:

from anchor import LinearDecay
decay = LinearDecay(half_life_hours=168.0)  # 7 days

Recency scorers control position-based scores within a sliding window:

from anchor import ExponentialRecencyScorer, LinearRecencyScorer

exp_scorer = ExponentialRecencyScorer(decay_rate=2.0)
print(exp_scorer.score(0, 10))   # ~0.0 (oldest)
print(exp_scorer.score(9, 10))   # 1.0  (newest)

lin_scorer = LinearRecencyScorer(min_score=0.5)
print(lin_scorer.score(0, 10))   # 0.5
print(lin_scorer.score(9, 10))   # 1.0

SimilarityConsolidator

Determines whether new memory entries should be added, merged, or skipped based on content hashing and cosine similarity.

import math
from anchor import SimilarityConsolidator, MemoryEntry

def embed_fn(text: str) -> list[float]:
    return [math.sin(i + len(text)) for i in range(8)]

consolidator = SimilarityConsolidator(
    embed_fn=embed_fn,
    similarity_threshold=0.85,
)

existing = [MemoryEntry(content="User prefers dark mode")]
new_entries = [
    MemoryEntry(content="User likes dark themes"),   # similar -> UPDATE
    MemoryEntry(content="User prefers dark mode"),    # exact dup -> NONE
    MemoryEntry(content="User works at Acme Corp"),   # new -> ADD
]

results = consolidator.consolidate(new_entries, existing)
for action, entry in results:
    print(action, entry.content if entry else "(skipped)")

[!CAUTION] The library never calls an LLM. You provide the embed_fn which can use any embedding provider (OpenAI, Cohere, local models, etc.).

MemoryGarbageCollector and GCStats

Prunes expired and decayed entries from a GarbageCollectableStore.

from anchor import MemoryGarbageCollector, EbbinghausDecay, InMemoryEntryStore

store = InMemoryEntryStore()
gc = MemoryGarbageCollector(store=store, decay=EbbinghausDecay())

stats = gc.collect(retention_threshold=0.1)
print(stats)  # GCStats(applied: expired_pruned=0, decayed_pruned=0, ...)

# Dry run -- identify what would be pruned without deleting
stats = gc.collect(retention_threshold=0.1, dry_run=True)
print(stats.total_pruned)

The collector works in two phases:

Expiry phase -- removes entries whose is_expired property is True.
Decay phase -- computes retention and removes entries below the threshold.

Both phases fire MemoryCallback hooks for observability.

Persistent Facts

MemoryManager manages persistent facts through a MemoryEntryStore. Content-hash deduplication prevents storing the same content twice.

from anchor import MemoryManager, MemoryType, InMemoryEntryStore

store = InMemoryEntryStore()
manager = MemoryManager(persistent_store=store)

# Add facts (returns existing entry if duplicate)
entry = manager.add_fact(
    "User prefers dark mode",
    tags=["preference"],
    memory_type=MemoryType.SEMANTIC,
)

# Search, update, delete
results = manager.get_relevant_facts("theme preference", top_k=3)
manager.update_fact(entry.id, "User prefers dark mode with blue accent")
manager.delete_fact(entry.id)
all_facts = manager.get_all_facts()

[!CAUTION] Calling add_fact without a configured persistent_store raises StorageError. Always pass a MemoryEntryStore to the constructor.

MemoryCallback and CallbackExtractor

MemoryCallback is a protocol for observing memory lifecycle events. Implement only the methods you care about:

class MyCallback:
    def on_eviction(self, turns, remaining_tokens):
        print(f"Evicted {len(turns)} turns, {remaining_tokens} tokens left")

    def on_decay_prune(self, pruned_entries, threshold):
        print(f"Pruned {len(pruned_entries)} entries below {threshold}")

CallbackExtractor delegates memory extraction to a user function:

from anchor import CallbackExtractor, MemoryType
from anchor.models.memory import ConversationTurn

def my_extractor(turns):
    return [
        {"content": "User prefers dark mode", "tags": ["preference"]},
        {"content": "User's name is Alice", "memory_type": "semantic"},
    ]

extractor = CallbackExtractor(extract_fn=my_extractor)
turns = [ConversationTurn(role="user", content="I prefer dark mode")]
entries = extractor.extract(turns)
print(entries[0].content)  # "User prefers dark mode"

Complete Example

A memory-aware pipeline with summarization, persistent facts, and garbage collection:

from anchor import (
    ContextPipeline, MemoryManager, SummaryBufferMemory,
    MemoryGarbageCollector, EbbinghausDecay, InMemoryEntryStore,
    QueryBundle,
)

store = InMemoryEntryStore()

def compact(turns):
    return "Previously: " + "; ".join(t.content[:50] for t in turns)

conversation = SummaryBufferMemory(max_tokens=2048, compact_fn=compact)
manager = MemoryManager(conversation_memory=conversation, persistent_store=store)

manager.add_fact("User prefers Python over JavaScript", tags=["preference"])
manager.add_user_message("How do I sort a list in Python?")
manager.add_assistant_message("Use sorted() or list.sort().")

pipeline = (
    ContextPipeline(max_tokens=4096)
    .with_memory(manager)
    .add_system_prompt("You are a helpful coding assistant.")
)
result = pipeline.build(QueryBundle(query_str="Show me an example"))
print(f"Context items: {len(result.window.items)}")

gc = MemoryGarbageCollector(store=store, decay=EbbinghausDecay())
stats = gc.collect(retention_threshold=0.1)
print(stats)

Next Steps

Memory API Reference -- constructor signatures, parameter tables, and method details for every memory class.
Pipeline Guide -- how memory integrates with the context pipeline.

Memory Management

On this page