Context Recall Evaluator

The Context Recall Evaluator measures whether the retrieved contexts contain all the information necessary to accurately answer the user's question. It helps evaluate the effectiveness of the retrieval component in RAG systems.

[Image: Context Recall Evaluator component interface and configuration]

Evaluation Notice: Low Context Recall scores indicate that your retrieval system may not be surfacing adequate information to answer user queries, which can lead to hallucinations or incomplete responses.

Component Inputs

  • Retrieved Contexts: The collection of retrieved passages or documents used to generate the response

    Example: ["The Amazon rainforest is being cleared primarily for cattle ranching and soy production.", "Illegal logging and mining also contribute significantly to deforestation in the region."]

  • Expected Contexts: The reference or expected contexts that should be retrieved

    Example: ["The Amazon rainforest is being cleared primarily for cattle ranching and soy production.", "Illegal logging and mining also contribute significantly to deforestation in the region.", "Infrastructure development, including road construction, is also a contributing factor to Amazon deforestation."]

  • Distance Measure: The method used to calculate the similarity between retrieved and expected contexts

    Example: "Semantic similarity"
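To make the relationship between these inputs concrete, here is a minimal sketch of how a recall score could be derived from them. This is not the component's actual algorithm: it substitutes a simple token-overlap (Jaccard) score for the semantic-similarity distance measure, and the `threshold` value is an arbitrary illustrative choice.

```python
def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two passages
    (a crude stand-in for a semantic similarity measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def naive_context_recall(retrieved, expected, threshold=0.5):
    """Fraction of expected contexts covered by at least one retrieved context."""
    if not expected:
        return 0.0
    hits = sum(
        1 for exp in expected
        if any(token_overlap(exp, ret) >= threshold for ret in retrieved)
    )
    return hits / len(expected)

retrieved = [
    "The Amazon rainforest is being cleared primarily for cattle ranching and soy production.",
    "Illegal logging and mining also contribute significantly to deforestation in the region.",
]
expected = retrieved + [
    "Infrastructure development, including road construction, is also a contributing factor to Amazon deforestation.",
]

# Two of the three expected contexts are covered, so recall is about 0.67
print(naive_context_recall(retrieved, expected))
```

With the example inputs above, the expected context about infrastructure development has no close match among the retrieved contexts, which is exactly the gap the evaluator's qualitative output would flag.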

Component Outputs

  • Evaluation Result: Qualitative assessment of what information is present or missing from the contexts

    Example: "The contexts contain information about cattle ranching, soy production, logging, and mining but lack information on infrastructure development."

Score Interpretation

High Context Recall (0.7-1.0)

The retrieved contexts contain most or all of the necessary information to answer the question accurately

Example Score: 0.95. This indicates excellent retrieval of relevant information.

Moderate Context Recall (0.3-0.7)

The retrieved contexts contain some but not all of the necessary information

Example Score: 0.55. This indicates partial retrieval of relevant information.

Low Context Recall (0.0-0.3)

The retrieved contexts are missing significant information needed to answer the question

Example Score: 0.15. This indicates poor retrieval of relevant information.
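When enforcing thresholds programmatically, the bands above can be expressed as a small helper. This is a hypothetical convenience function, not part of any library; since the documented ranges share their boundary values, it assigns a boundary score to the higher band.

```python
def recall_band(score: float) -> str:
    """Map a Context Recall score to the band names used in this documentation."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("Context Recall scores fall in [0.0, 1.0]")
    if score >= 0.7:
        return "high"      # 0.7-1.0: most or all necessary information present
    if score >= 0.3:
        return "moderate"  # 0.3-0.7: some but not all necessary information
    return "low"           # 0.0-0.3: significant information missing

print(recall_band(0.95))  # high
print(recall_band(0.55))  # moderate
print(recall_band(0.15))  # low
```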

Implementation Example

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import ContextRecall

# Create the metric
context_recall = ContextRecall()

# Build the evaluation dataset. Context Recall compares the retrieved
# contexts against a reference answer, so a ground_truth column is
# required alongside question, contexts, and answer.
eval_dataset = Dataset.from_dict({
    "question": [
        "What are the major causes of deforestation in the Amazon rainforest?"
    ],
    "contexts": [[
        "The Amazon rainforest is being cleared primarily for cattle ranching and soy production.",
        "Illegal logging and mining also contribute significantly to deforestation in the region.",
    ]],
    "answer": [
        "The major causes of deforestation in the Amazon rainforest include cattle ranching, soy farming, logging operations, and mining activities."
    ],
    "ground_truth": [
        "Deforestation in the Amazon is driven mainly by cattle ranching and soy production, with illegal logging, mining, and infrastructure development such as road construction also contributing."
    ],
})

result = evaluate(
    eval_dataset,
    metrics=[context_recall],
)
print(result)
```

Use Cases

  • Retrieval System Evaluation: Assess the effectiveness of different retrieval mechanisms in finding relevant information
  • Knowledge Base Assessment: Determine if your knowledge base contains sufficient information to answer common questions
  • Query Reformulation: Guide the development of query reformulation strategies to improve information retrieval
  • Chunking Strategy Optimization: Compare different document chunking approaches to maximize information retrieval
  • Identifying Knowledge Gaps: Discover areas where your knowledge base needs additional content

Best Practices

  • Use Context Recall in combination with Context Precision to fully evaluate your retrieval system
  • Establish minimum acceptable Context Recall thresholds based on your application's requirements
  • Analyze patterns in missing information to guide knowledge base improvements
  • Compare Context Recall across different retrieval methods and vector database configurations
  • Consider adjusting the number of retrieved documents based on Context Recall performance