Astro Intelligence

Query Transform & Classifier API Reference

API reference for the anchor.query module. For usage patterns and examples, see the Query Transform Guide and the Classifiers Guide.


Query Transformers

All transformers expose a transform(query: QueryBundle) -> list[QueryBundle] method. They accept callback functions for LLM generation so that anchor never calls an LLM directly.

HyDETransformer

Hypothetical Document Embeddings. Generates a hypothetical answer and uses it as the retrieval query.

class HyDETransformer(
    generate_fn: Callable[[str], str],
)
Parameter    Type                  Default   Description
generate_fn  Callable[[str], str]  required  Takes query string, returns hypothetical document

transform(query)

Returns a single-element list. The output QueryBundle has query_str set to the hypothetical document and metadata keys original_query and transform = "hyde".
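
The contract above can be sketched without the library. The QueryBundle dataclass below is a stand-in with only the fields named in this reference, and hyde_transform and fake_generate are hypothetical names for illustration; the real class in anchor.query may carry more fields and behave differently.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class QueryBundle:
    """Stand-in for anchor's QueryBundle; only the fields used here."""
    query_str: str
    metadata: dict = field(default_factory=dict)


def hyde_transform(
    query: QueryBundle, generate_fn: Callable[[str], str]
) -> list[QueryBundle]:
    # The hypothetical document replaces the text used for retrieval.
    hypothetical_doc = generate_fn(query.query_str)
    return [
        QueryBundle(
            query_str=hypothetical_doc,
            metadata={"original_query": query.query_str, "transform": "hyde"},
        )
    ]


def fake_generate(query: str) -> str:
    # A real callback would call an LLM; this stub keeps the sketch offline.
    return f"A short passage that answers: {query}"


result = hyde_transform(QueryBundle("What is RRF?"), fake_generate)
```

Because generate_fn is an ordinary callable, any LLM client can be wrapped in a function of this shape and passed in.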


MultiQueryTransformer

Generates multiple query variations for broader retrieval coverage.

class MultiQueryTransformer(
    generate_fn: Callable[[str, int], list[str]],
    num_queries: int = 3,
)
Parameter    Type                             Default   Description
generate_fn  Callable[[str, int], list[str]]  required  Takes query string and count, returns variations
num_queries  int                              3         Number of variations to generate

transform(query)

Returns a list of N+1 QueryBundle objects: the original query as the first element, followed by N generated variations. Each variation carries metadata keys original_query, transform = "multi_query", and variation_index.
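
A minimal sketch of the N+1 output shape, assuming a stand-in QueryBundle dataclass and a stubbed callback. The names multi_query_transform and fake_variations are hypothetical, and the zero-based variation_index is an assumption, since this reference does not state where the index starts.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class QueryBundle:
    """Stand-in for anchor's QueryBundle; only the fields used here."""
    query_str: str
    metadata: dict = field(default_factory=dict)


def multi_query_transform(
    query: QueryBundle,
    generate_fn: Callable[[str, int], list[str]],
    num_queries: int = 3,
) -> list[QueryBundle]:
    variations = generate_fn(query.query_str, num_queries)
    bundles = [query]  # the original query stays first
    for index, text in enumerate(variations):
        bundles.append(
            QueryBundle(
                query_str=text,
                metadata={
                    "original_query": query.query_str,
                    "transform": "multi_query",
                    "variation_index": index,  # zero-based here; an assumption
                },
            )
        )
    return bundles


def fake_variations(query: str, count: int) -> list[str]:
    # Stub for an LLM call: returns `count` trivially reworded queries.
    return [f"{query} (rephrasing {i + 1})" for i in range(count)]


bundles = multi_query_transform(QueryBundle("vector search tuning"), fake_variations)
```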


DecompositionTransformer

Breaks a complex query into simpler sub-questions.

class DecompositionTransformer(
    generate_fn: Callable[[str], list[str]],
)
Parameter    Type                        Default   Description
generate_fn  Callable[[str], list[str]]  required  Takes query string, returns sub-questions

transform(query)

Returns a list of QueryBundle objects, one per sub-question. Each carries metadata keys parent_query, transform = "decomposition", and sub_question_index.
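
Note that decomposition records the source query under parent_query rather than original_query. The sketch below shows that shape with a stand-in QueryBundle dataclass; decompose_transform and fake_decompose are hypothetical names, and the zero-based sub_question_index is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class QueryBundle:
    """Stand-in for anchor's QueryBundle; only the fields used here."""
    query_str: str
    metadata: dict = field(default_factory=dict)


def decompose_transform(
    query: QueryBundle, generate_fn: Callable[[str], list[str]]
) -> list[QueryBundle]:
    # One output bundle per sub-question; the original is not included.
    return [
        QueryBundle(
            query_str=sub_question,
            metadata={
                "parent_query": query.query_str,
                "transform": "decomposition",
                "sub_question_index": index,  # zero-based here; an assumption
            },
        )
        for index, sub_question in enumerate(generate_fn(query.query_str))
    ]


def fake_decompose(query: str) -> list[str]:
    # Stub for an LLM call: naively splits the query on " and ".
    return [part.strip().rstrip("?") + "?" for part in query.split(" and ")]


subs = decompose_transform(
    QueryBundle("What is HyDE and when should I use it?"), fake_decompose
)
```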


StepBackTransformer

Generates a more abstract version of the query alongside the original.

class StepBackTransformer(
    generate_fn: Callable[[str], str],
)
Parameter    Type                  Default   Description
generate_fn  Callable[[str], str]  required  Takes query string, returns abstract version

transform(query)

Returns a two-element list: [original_query, step_back_query]. The step-back query carries metadata keys original_query and transform = "step_back".


Conversation-Aware Transformers

ConversationRewriter

Rewrites a query using conversation history via a user-supplied callback. When chat_history is empty, returns the original query unchanged.

class ConversationRewriter(
    rewrite_fn: Callable[[str, list[ConversationTurn]], str],
)
Parameter   Type                                          Default   Description
rewrite_fn  Callable[[str, list[ConversationTurn]], str]  required  Takes query string and history, returns rewritten query

transform(query)

Returns a single-element list. If query.chat_history is non-empty, the output QueryBundle has metadata keys original_query and transform = "conversation_rewrite". The embedding and chat_history fields are preserved from the original query.
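
The two branches (history present, history empty) can be sketched as follows. ConversationTurn and QueryBundle are stand-in dataclasses whose role and content fields are assumptions, and conversation_rewrite and fake_rewrite are hypothetical names; the library's own types may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ConversationTurn:
    """Stand-in turn type; `role` and `content` are assumed field names."""
    role: str
    content: str


@dataclass
class QueryBundle:
    """Stand-in for anchor's QueryBundle; only the fields used here."""
    query_str: str
    embedding: Optional[list[float]] = None
    chat_history: list[ConversationTurn] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def conversation_rewrite(
    query: QueryBundle,
    rewrite_fn: Callable[[str, list[ConversationTurn]], str],
) -> list[QueryBundle]:
    if not query.chat_history:
        return [query]  # empty history: the original query passes through
    return [
        QueryBundle(
            query_str=rewrite_fn(query.query_str, query.chat_history),
            embedding=query.embedding,  # preserved from the original
            chat_history=query.chat_history,  # preserved from the original
            metadata={
                "original_query": query.query_str,
                "transform": "conversation_rewrite",
            },
        )
    ]


def fake_rewrite(query: str, history: list[ConversationTurn]) -> str:
    # Stub: a real callback would ask an LLM to resolve references like "it".
    return f"{query} (in the context of: {history[-1].content})"


history = [ConversationTurn("user", "Tell me about HyDE")]
rewritten = conversation_rewrite(
    QueryBundle("How does it work?", chat_history=history), fake_rewrite
)
unchanged = conversation_rewrite(QueryBundle("How does it work?"), fake_rewrite)
```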


ContextualQueryTransformer

Wraps another transformer, prepending conversation context to the query before delegation.

class ContextualQueryTransformer(
    inner: QueryTransformer,
    context_prefix: str = "Given the conversation context: ",
)
Parameter       Type              Default                             Description
inner           QueryTransformer  required                            Wrapped transformer to delegate to
context_prefix  str               "Given the conversation context: "  Text prepended before the summary

transform(query)

When chat_history is non-empty, builds a summary string from the conversation turns in "role: content | role: content" format, prepends context_prefix, and delegates the augmented query to the inner transformer. When history is empty, delegates directly without modification.

Returns: The output of inner.transform().
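
The summary format can be illustrated with plain strings. In this sketch build_contextual_query is a hypothetical helper, history is represented as (role, content) tuples for brevity, and the newline joining the summary to the query text is a guess, since this reference does not specify how the two are combined.

```python
def build_contextual_query(
    query_str: str,
    history: list[tuple[str, str]],
    context_prefix: str = "Given the conversation context: ",
) -> str:
    if not history:
        return query_str  # empty history: the query is passed on unmodified
    # "role: content | role: content" summary, as described above.
    summary = " | ".join(f"{role}: {content}" for role, content in history)
    # Joining with a newline is an assumption, not documented behaviour.
    return f"{context_prefix}{summary}\n{query_str}"


augmented = build_contextual_query(
    "How does it work?",
    [("user", "Tell me about HyDE"), ("assistant", "HyDE drafts a fake answer")],
)
plain = build_contextual_query("How does it work?", [])
```

The resulting string is what the inner transformer would receive as its query text.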


QueryTransformPipeline

Chains multiple query transformers and deduplicates results. Each transformer is applied to every query produced by the previous stage.

class QueryTransformPipeline(
    transformers: list[QueryTransformer],
)
Parameter     Type                    Default   Description
transformers  list[QueryTransformer]  required  Ordered sequence of transformers

Methods

transform(query)

Apply all transformers in sequence and deduplicate by query_str.

Parameter  Type         Default   Description
query      QueryBundle  required  Original query

Returns: list[QueryBundle] -- deduplicated list of transformed queries.
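
A rough sketch of the fan-out and deduplication, using plain strings in place of QueryBundle objects and plain functions in place of transformers. run_pipeline, add_step_back, and add_lowercase are hypothetical; deduplicating once at the end and keeping first-seen order are both assumptions.

```python
from typing import Callable

Transformer = Callable[[str], list[str]]


def run_pipeline(query: str, transformers: list[Transformer]) -> list[str]:
    current = [query]
    for transformer in transformers:
        next_stage: list[str] = []
        for q in current:
            # Every query from the previous stage is fed to this transformer.
            next_stage.extend(transformer(q))
        current = next_stage
    # Deduplicate by query text, keeping the first occurrence of each.
    return list(dict.fromkeys(current))


def add_step_back(query: str) -> list[str]:
    return [query, "ranking basics"]


def add_lowercase(query: str) -> list[str]:
    return [query, query.lower()]


queries = run_pipeline("What is RRF?", [add_step_back, add_lowercase])
```

Here the second stage emits "ranking basics" twice (it is already lowercase), and the duplicate is dropped. Because each stage multiplies the query count, long pipelines can grow quickly.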

atransform(query) (async)

Async version. Transformers implementing AsyncQueryTransformer are called via atransform; others fall back to synchronous transform.

Parameter  Type         Default   Description
query      QueryBundle  required  Original query

Returns: list[QueryBundle]
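
The async fallback can be sketched with asyncio. Everything here is illustrative: the two transformer classes and arun_pipeline are hypothetical, strings stand in for QueryBundle objects, and detecting async support with hasattr is an assumption about how the library checks for AsyncQueryTransformer.

```python
import asyncio


class SyncUpper:
    """Sync-only transformer: exposes transform() but not atransform()."""

    def transform(self, query: str) -> list[str]:
        return [query.upper()]


class AsyncSuffix:
    """Async-capable transformer: exposes atransform()."""

    async def atransform(self, query: str) -> list[str]:
        await asyncio.sleep(0)  # stands in for an awaited LLM call
        return [query, query + "?"]


async def arun_pipeline(query: str, transformers: list) -> list[str]:
    current = [query]
    for transformer in transformers:
        next_stage: list[str] = []
        for q in current:
            if hasattr(transformer, "atransform"):
                next_stage.extend(await transformer.atransform(q))
            else:
                # Fallback: the synchronous call runs inline in this sketch.
                next_stage.extend(transformer.transform(q))
        current = next_stage
    return list(dict.fromkeys(current))  # dedupe, keeping first-seen order


result = asyncio.run(arun_pipeline("hello", [SyncUpper(), AsyncSuffix()]))
```

In this sketch a slow synchronous transformer blocks the event loop while it runs, so callbacks that perform network calls are better given an async implementation.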


Query Classifiers

All classifiers implement the QueryClassifier protocol:

def classify(self, query: QueryBundle) -> str

KeywordClassifier

Classifies queries by matching keywords in the query string. Rules are evaluated in insertion order; first match wins.

class KeywordClassifier(
    rules: dict[str, list[str]],
    default: str,
    case_sensitive: bool = False,
)
Parameter       Type                  Default   Description
rules           dict[str, list[str]]  required  Label-to-keywords mapping
default         str                   required  Fallback label when no rule matches
case_sensitive  bool                  False     Whether matching is case-sensitive

classify(query)

Scans query.query_str for keywords. Returns the label of the first matching rule, or default.
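
The first-match-wins rule can be sketched as below. keyword_classify is a hypothetical function, and plain substring matching is an assumption; the library may match on word boundaries instead.

```python
def keyword_classify(
    query_str: str,
    rules: dict[str, list[str]],
    default: str,
    case_sensitive: bool = False,
) -> str:
    text = query_str if case_sensitive else query_str.lower()
    # Python dicts preserve insertion order, so earlier rules take priority.
    for label, keywords in rules.items():
        for keyword in keywords:
            needle = keyword if case_sensitive else keyword.lower()
            if needle in text:  # substring match; an assumption
                return label
    return default


rules = {"code": ["python", "function"], "docs": ["guide", "tutorial"]}

both_match = keyword_classify("Where is the Python guide?", rules, "general")
no_match = keyword_classify("hello there", rules, "general")
strict = keyword_classify(
    "Where is the Python guide?", rules, "general", case_sensitive=True
)
```

The first query matches both rules, so the earlier "code" rule wins. With case_sensitive=True the keyword "python" no longer matches "Python", and the query falls through to "docs".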


CallbackClassifier

Delegates classification to a user-supplied callback.

class CallbackClassifier(
    classify_fn: Callable[[QueryBundle], str],
)
Parameter    Type                          Default   Description
classify_fn  Callable[[QueryBundle], str]  required  Classification callback

classify(query)

Returns the string label from the callback.


EmbeddingClassifier

Classifies by comparing query embedding to labelled centroid embeddings.

class EmbeddingClassifier(
    centroids: dict[str, list[float]],
    distance_fn: Callable[[list[float], list[float]], float] | None = None,
)
Parameter    Type                                                Default                   Description
centroids    dict[str, list[float]]                              required                  Label-to-centroid embedding mapping
distance_fn  Callable[[list[float], list[float]], float] | None  None (cosine similarity)  Similarity function (higher = closer)

classify(query)

Compares query.embedding to each centroid and returns the label with the highest similarity score.

Raises: ValueError if query.embedding is None.
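
A minimal sketch of centroid matching, assuming a pure-Python cosine similarity. embedding_classify and cosine_similarity are hypothetical names; the point to note is that a custom distance_fn must return higher scores for closer vectors, so a true distance has to be negated.

```python
import math
from typing import Callable, Optional

Vector = list[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embedding_classify(
    embedding: Optional[Vector],
    centroids: dict[str, Vector],
    distance_fn: Optional[Callable[[Vector, Vector], float]] = None,
) -> str:
    if embedding is None:
        raise ValueError("query.embedding is None")
    score = distance_fn or cosine_similarity
    # Pick the label whose centroid scores highest against the query.
    return max(centroids, key=lambda label: score(embedding, centroids[label]))


centroids = {"code": [1.0, 0.0], "docs": [0.0, 1.0]}
by_cosine = embedding_classify([0.9, 0.1], centroids)
# Euclidean distance is negated so that higher still means closer.
by_euclid = embedding_classify(
    [0.2, 0.8], centroids, distance_fn=lambda a, b: -math.dist(a, b)
)
```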


Pipeline Integration Functions

query_transform_step(name, transformer, retriever, top_k=10)

Create a pipeline step that transforms the query, retrieves for each variant, and merges results via Reciprocal Rank Fusion (RRF).

from anchor.pipeline import query_transform_step
Parameter    Type              Default   Description
name         str               required  Descriptive name for this step
transformer  QueryTransformer  required  Transformer to expand the query
retriever    Retriever         required  Retriever to run per expanded query
top_k        int               10        Max items to retrieve per variant

Returns: PipelineStep
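
For readers unfamiliar with Reciprocal Rank Fusion, the merge can be sketched as follows. Each document scores the sum of 1 / (k + rank) over the rankings it appears in, with rank starting at 1. The function name is hypothetical, and k = 60 is the value commonly used in the literature; the constant anchor uses is not stated in this reference.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents ranked highly in many lists accumulate the most score.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# One ranked list of document ids per query variant.
per_variant_results = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c"],
]
merged = reciprocal_rank_fusion(per_variant_results)
```

Document "b" ranks first because it appears near the top of all three lists, while "d" appears only once and ranks last. RRF uses only ranks, so retrievers with incomparable score scales can be fused safely.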

classified_retriever_step(name, classifier, retrievers, default=None, top_k=10)

Create a pipeline step that classifies the query and routes to the matching retriever.

from anchor.pipeline import classified_retriever_step
Parameter   Type                  Default   Description
name        str                   required  Human-readable step name
classifier  QueryClassifier       required  Classifier returning a label string
retrievers  dict[str, Retriever]  required  Label-to-retriever mapping
default     str | None            None      Fallback label when classified label not found
top_k       int                   10        Maximum items to retrieve

Returns: PipelineStep

Raises: RetrieverError if the classified label has no matching retriever and no default is configured.
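
The routing and fallback logic can be sketched as below. The RetrieverError class is a local stand-in, select_retriever is a hypothetical helper, and strings stand in for retriever objects. Raising when the default label itself is missing from retrievers is an assumption.

```python
from typing import Optional


class RetrieverError(Exception):
    """Stand-in for anchor's RetrieverError."""


def select_retriever(
    label: str, retrievers: dict[str, str], default: Optional[str] = None
) -> str:
    if label in retrievers:
        return retrievers[label]
    # Fall back to the default label when the classified label is unknown.
    if default is not None and default in retrievers:
        return retrievers[default]
    raise RetrieverError(
        f"no retriever for label {label!r} and no usable default configured"
    )


retrievers = {"code": "code_retriever", "docs": "docs_retriever"}
direct = select_retriever("code", retrievers)
fallback = select_retriever("billing", retrievers, default="docs")
```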
