# Context Engineering
"Context is the product. The LLM is just the consumer."
Context engineering is the discipline of designing and assembling the information that an LLM receives in its prompt. A well-engineered context window directly determines the quality of the model's output -- regardless of which model you use.
## Why Context Matters
Large language models do not have memory. Every request starts from a blank slate. The only way an LLM "knows" anything about your user, your data, or your application is through the context you provide in the prompt.
This makes context the single most important lever for AI application quality. A mediocre model with excellent context will outperform a frontier model with poor context.
Context engineering answers questions like:
- Which documents should the model see for this query?
- How much conversation history fits in the window?
- Should system instructions or retrieval results get priority when space is tight?
- How do you prevent one source from crowding out another?
## The Problem with Current Approaches
Most AI frameworks treat context assembly as an afterthought. RAG pipelines concatenate retrieval results into a prompt string. Chat frameworks append messages until the context window overflows. Memory systems operate in isolation from retrieval.
The result is context that is assembled ad hoc -- without token awareness, without priority ranking, and without diagnostics.
| Feature | LangChain | LlamaIndex | mem0 | anchor |
|---|---|---|---|---|
| Hybrid RAG (Dense + BM25 + RRF) | partial | yes | no | yes |
| Token-aware Memory | partial | no | yes | yes |
| Token Budget Management | no | no | no | yes |
| Provider-agnostic Formatting | no | no | no | yes |
| Protocol-based Plugins | no | partial | no | yes |
| Zero-config Defaults | no | no | yes | yes |
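One row in the table deserves a note: reciprocal rank fusion (RRF) is the standard way to merge dense and BM25 rankings into a single hybrid result list. The sketch below is a generic, library-independent illustration -- the `k = 60` constant and document ids are conventional placeholders, not anchor internals:

```python
# Reciprocal rank fusion (RRF): merge ranked lists from dense and BM25
# retrieval. Generic illustration -- not anchor's internal implementation.
def rrf_fuse(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank); k dampens the
            # dominance of any single list's top result.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse a dense-retrieval ranking with a BM25 ranking
dense = ["doc_a", "doc_b", "doc_c"]
bm25 = ["doc_b", "doc_d", "doc_a"]
print(rrf_fuse([dense, bm25]))  # docs appearing in both lists rank highest
```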
## The anchor Philosophy
anchor is built on three design principles that set it apart.
### 1. Model-Agnostic by Design
The core library never calls an LLM. It does not import `openai`, `anthropic`, or any model SDK. You provide the embedding function, the tokenizer, and the retrieval backend; the library handles assembly, ranking, and budgeting.
This means you can swap models, providers, or even embedding strategies without changing your context pipeline.
```python
import math

# You own the embedding function -- any model, any API, any dimension
def my_embed_fn(text: str) -> list[float]:
    seed = sum(ord(c) for c in text) % 10000
    raw = [math.sin(seed * 1000 + i) for i in range(64)]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw] if norm else raw
```

### 2. Token-Aware Everything
Every component in anchor is token-aware:
- `ContextItem` carries a `token_count` field.
- `ContextWindow` tracks `used_tokens` and `remaining_tokens`.
- `MemoryManager` evicts oldest turns when the conversation exceeds its token budget.
- `TokenBudget` allocates portions of the window to different source types with per-source caps and overflow strategies.
Nothing is concatenated blindly. The library always knows how many tokens remain and what was dropped.
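As a rough sketch of how these pieces could compose: the class names come from the list above, but every constructor argument and source-type key below is an assumption for illustration, not anchor's documented API.

```python
# Hypothetical configuration sketch -- argument names and import path are
# assumed, not anchor's documented API. It shows the intent: every source
# gets an explicit share of the window, so nothing is concatenated blindly.
from anchor import ContextWindow, TokenBudget  # import path assumed

window = ContextWindow(max_tokens=8192)
budget = TokenBudget(
    window=window,
    allocations={
        "system": 1024,     # per-source caps keep one source from
        "retrieval": 4096,  # crowding out another
        "memory": 2048,
        "tools": 1024,
    },
)
```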
### 3. Protocol-Based Plugins
Instead of deep inheritance hierarchies, anchor uses Python Protocols (PEP 544) for all extension points. Any object with the right method signatures works -- no base class required.
```python
from anchor.protocols import Retriever

# This class satisfies the Retriever protocol without inheriting anything
class MyRetriever:
    def retrieve(self, query, top_k=10):
        return [...]  # Return list of ContextItem
```

This makes it trivial to integrate existing code, wrap third-party APIs, or test with simple stubs.
> [!TIP]
> **Duck typing for AI.** Protocols give you the benefits of interfaces (type safety, IDE autocompletion, runtime checking with `isinstance`) without the rigidity of inheritance. See the Protocols concept page for details.
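That runtime `isinstance` check comes straight from the standard library's `typing.runtime_checkable`. The protocol below is a simplified stand-in for illustration, not anchor's actual `Retriever` definition:

```python
from typing import Protocol, runtime_checkable

# Simplified stand-in for anchor's Retriever protocol (PEP 544)
@runtime_checkable
class SupportsRetrieve(Protocol):
    def retrieve(self, query: str, top_k: int = 10) -> list: ...

class StubRetriever:
    """Satisfies the protocol structurally -- no base class needed."""
    def retrieve(self, query: str, top_k: int = 10) -> list:
        return []

assert isinstance(StubRetriever(), SupportsRetrieve)  # structural check
```

Note that `runtime_checkable` protocols check for the presence of the method at runtime, not its signature; full signature checking happens statically via a type checker.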
## What anchor Is Not
anchor is not:
- An LLM wrapper -- it never generates text.
- A vector database -- it provides in-memory stores for development and protocols for plugging in any backend.
- An agent framework -- it assembles context for agents, but does not manage tool execution or planning.
It sits between your data sources and your LLM call, ensuring the model receives the best possible context within its token budget.
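In code, that position looks roughly like this. `ContextPipeline` and `.build()` appear in the diagram below; the import path, the `add_source` hook, and the keyword arguments are illustrative assumptions:

```python
# Hypothetical glue sketch -- ContextPipeline and .build() are named in the
# data-flow diagram below; add_source and the arguments are assumptions.
from anchor import ContextPipeline  # import path assumed

pipeline = ContextPipeline()
pipeline.add_source(MyRetriever())  # MyRetriever from the protocol example

context = pipeline.build(query="What changed in the last release?")

# anchor stops here: `context` is handed to whatever LLM client you own
# (Anthropic, OpenAI, ...) -- the library never makes the call itself.
```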
## Core Data Flow
```
 Your Data Sources             anchor                Your LLM Call
+-----------------+     +------------------+     +--------------------+
| Vector DB       | --> |                  |     |                    |
| Conversation    | --> |  ContextPipeline | --> | Anthropic / OpenAI |
| System Prompts  | --> |     .build()     |     |      API call      |
| Tool Results    | --> |                  |     |                    |
+-----------------+     +------------------+     +--------------------+
```

Inside `build()`, the pipeline applies:

- Priority ranking
- Token budgets
- Overflow tracking
- Provider formatting
- Diagnostics

## Getting Started
To build your first context pipeline, see the Getting Started guide. For deeper architectural details, read the Architecture page.
## See Also
- Architecture -- six-stage pipeline internals
- Protocols -- PEP 544 structural subtyping
- Token Budgets -- per-source allocation and overflow