Context Precision Evaluator
The Context Precision Evaluator measures how relevant the retrieved contexts are to the user's query. It helps identify cases where the retriever returns irrelevant or tangential information alongside the passages that actually answer the question.

Context Precision Evaluator component interface and configuration
Evaluation Notice: Low Context Precision scores indicate that your retrieval system is surfacing irrelevant information, which can lead to distraction, confusion, or injection of incorrect information into responses.
Component Inputs
- Retrieved Contexts: The collection of retrieved passages or documents used to generate the response
Example: ["Electric vehicles produce zero direct emissions, which improves air quality.", "The history of automobiles dates back to the late 19th century when the first gasoline cars were invented."]
- Expected Contexts: The reference or expected contexts that are considered relevant
Example: ["Electric vehicles produce zero direct emissions, which improves air quality.", "EVs have lower operating costs compared to conventional vehicles."]
- Distance Measure: The method used to calculate the relevance of retrieved contexts
Example: "Semantic similarity"
Component Outputs
- Evaluation Result: Qualitative assessment of the relevance of each retrieved context (a sketch of how these inputs and outputs fit together follows below)
Example: "Context #1 is highly relevant to the question about electric vehicle benefits. Context #2 about automobile history is tangential and less relevant."
Score Interpretation
High Context Precision (0.7-1.0)
Most or all of the retrieved contexts are relevant to the query
Example Score: 0.95
This indicates excellent retrieval precision with minimal irrelevant information
Moderate Context Precision (0.3-0.7)
Some retrieved contexts are relevant, but others contain off-topic or tangential information
Example Score: 0.50
This indicates a mix of relevant and irrelevant contexts
Low Context Precision (0.0-0.3)
Most retrieved contexts are irrelevant to the query
Example Score: 0.15
This indicates poor retrieval precision with mostly irrelevant information
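For reporting, a small helper can map a numeric score onto the bands above. This is only a convenience sketch; the function name is made up, and the thresholds simply follow the ranges listed in this section.

def interpret_context_precision(score: float) -> str:
    """Map a context precision score onto the qualitative bands above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("Context precision scores are expected in the range [0.0, 1.0]")
    if score >= 0.7:
        return "High: most or all retrieved contexts are relevant"
    if score >= 0.3:
        return "Moderate: a mix of relevant and irrelevant contexts"
    return "Low: most retrieved contexts are irrelevant"

print(interpret_context_precision(0.95))  # High: most or all retrieved contexts are relevant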
Implementation Example
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import ContextPrecision

# Create the metric
context_precision = ContextPrecision()

# Build a small evaluation dataset. ContextPrecision judges each retrieved
# context against the question and a reference answer, so a "ground_truth"
# column is included here (column names follow ragas v0.1-style datasets;
# adjust them to match your ragas version).
eval_dataset = Dataset.from_dict({
    "question": ["What are the benefits of electric vehicles?"],
    "contexts": [[
        "Electric vehicles produce zero direct emissions, "
        "which improves air quality.",
        "The history of automobiles dates back to the late "
        "19th century when the first gasoline cars were invented.",
    ]],
    "ground_truth": [
        "Electric vehicles produce zero direct emissions, which improves "
        "air quality, and they have lower operating costs than conventional vehicles."
    ],
})

# Run the evaluation (this calls the configured judge LLM; by default ragas
# uses OpenAI, so API credentials must be available).
result = evaluate(
    eval_dataset,
    metrics=[context_precision],
)
print(result)
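The printed result contains the aggregate context precision for the dataset as a score between 0.0 and 1.0; the exact output format and value depend on the judge LLM and the ragas version in use.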
Use Cases
- Retrieval Efficiency: Optimize retrieval systems to avoid wasting computational resources on irrelevant contexts
- Vector Search Tuning: Fine-tune vector search parameters (like similarity thresholds) to improve precision (see the threshold sweep sketch after this list)
- Query Parsing Improvement: Refine query parsing methods to better capture user intent
- Distraction Reduction: Prevent models from being distracted by irrelevant information that could lead to hallucinations
- Document Preprocessing: Evaluate different document chunking and preprocessing approaches
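To illustrate the vector search tuning point above, the sketch below sweeps a similarity threshold over a small set of labelled candidates and reports the precision at each cut-off. The similarity scores and relevance labels are invented example data, and precision_at_threshold is a made-up helper, not part of the evaluator.

from typing import Dict, List, Tuple

# Hypothetical (similarity, is_relevant) pairs for candidates returned by a vector search.
candidates: List[Tuple[float, bool]] = [
    (0.91, True), (0.84, True), (0.78, False), (0.66, True), (0.52, False), (0.40, False),
]

def precision_at_threshold(cands: List[Tuple[float, bool]], threshold: float) -> float:
    """Precision over the candidates whose similarity clears the threshold."""
    kept = [rel for sim, rel in cands if sim >= threshold]
    return sum(kept) / len(kept) if kept else 0.0

# Sweep thresholds to see how precision changes as the cut-off tightens.
sweep: Dict[float, float] = {
    t: precision_at_threshold(candidates, t) for t in (0.4, 0.5, 0.6, 0.7, 0.8)
}
for threshold, precision in sweep.items():
    print(f"threshold={threshold:.1f} -> precision={precision:.2f}")

In practice the relevance labels would come from the evaluator's per-context verdicts rather than hand annotation.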
Best Practices
- Balance Context Precision with Context Recall - improving precision may reduce recall
- Consider implementing a pre-filtering step to remove obviously irrelevant documents before more expensive processing (a sketch follows this list)
- Track precision metrics over time to detect drift in retrieval effectiveness
- Use domain-specific knowledge to define context relevance for specialized applications
- Combine with query-specific filters to improve precision for particular types of questions
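For the pre-filtering suggestion above, one simple option is to drop candidates whose embedding similarity to the query falls below a coarse threshold before running the more expensive LLM-based evaluation. The sketch below assumes embedding vectors are already available; prefilter_contexts and the 0.3 cut-off are illustrative, not part of any library API.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Standard cosine similarity between two vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def prefilter_contexts(query_embedding: np.ndarray,
                       contexts: list,
                       context_embeddings: list,
                       min_similarity: float = 0.3) -> list:
    """Keep only contexts whose embedding is at least loosely similar to the query."""
    return [
        ctx
        for ctx, emb in zip(contexts, context_embeddings)
        if cosine_similarity(query_embedding, emb) >= min_similarity
    ]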