
Context Precision Evaluator

The Context Precision Evaluator measures how relevant the retrieved contexts are to the user's query. It helps identify cases where irrelevant or tangential information is included in the retrieved contexts.

Figure: Context Precision Evaluator component interface and configuration

Evaluation Notice: Low Context Precision scores indicate that your retrieval system is surfacing irrelevant information, which can lead to distraction, confusion, or injection of incorrect information into responses.

Component Inputs

  • Retrieved Contexts: The collection of retrieved passages or documents used to generate the response

    Example: ["Electric vehicles produce zero direct emissions, which improves air quality.", "The history of automobiles dates back to the late 19th century when the first gasoline cars were invented."]

  • Expected Contexts: The reference or expected contexts that are considered relevant

    Example: ["Electric vehicles produce zero direct emissions, which improves air quality.", "EVs have lower operating costs compared to conventional vehicles."]

  • Distance Measure: The method used to calculate the relevance of retrieved contexts

    Example: "Semantic similarity"

Component Outputs

  • Evaluation Result: Qualitative assessment of the relevance of each retrieved context

    Example: "Context #1 is highly relevant to the question about electric vehicle benefits. Context #2 about automobile history is tangential and less relevant."

Score Interpretation

High Context Precision (0.7-1.0)

Most or all of the retrieved contexts are relevant to the query

Example Score: 0.95. This indicates excellent retrieval precision with minimal irrelevant information.

Moderate Context Precision (0.3-0.7)

Some retrieved contexts are relevant, but others contain off-topic or tangential information

Example Score: 0.50. This indicates a mix of relevant and irrelevant contexts.

Low Context Precision (0.0-0.3)

Most retrieved contexts are irrelevant to the query

Example Score: 0.15. This indicates poor retrieval precision with mostly irrelevant information.
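The bands above can be applied programmatically when triaging evaluation runs. The helper below is purely illustrative (not part of any library) and simply maps a score to the three ranges described in this section.

def interpret_context_precision(score: float) -> str:
    # Context precision scores fall in the range [0.0, 1.0].
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= 0.7:
        return "High: most or all retrieved contexts are relevant"
    if score >= 0.3:
        return "Moderate: a mix of relevant and irrelevant contexts"
    return "Low: mostly irrelevant contexts"

print(interpret_context_precision(0.95))  # High
print(interpret_context_precision(0.50))  # Moderate
print(interpret_context_precision(0.15))  # Low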

Implementation Example

from datasets import Dataset

from ragas import evaluate
from ragas.metrics import ContextPrecision

# Create the metric
context_precision = ContextPrecision()

# Build a small evaluation dataset. ContextPrecision judges each retrieved
# context against a reference answer, so a "ground_truth" column is typically
# required as well (assumed here; check your ragas version).
eval_dataset = Dataset.from_dict({
    "question": ["What are the benefits of electric vehicles?"],
    "contexts": [[
        "Electric vehicles produce zero direct emissions, which improves air quality.",
        "The history of automobiles dates back to the late 19th century when the first gasoline cars were invented.",
    ]],
    "ground_truth": [
        "Electric vehicles improve air quality because they produce zero direct emissions."
    ],
})

# Run the evaluation with the Context Precision metric
result = evaluate(eval_dataset, metrics=[context_precision])
print(result)
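Note that evaluate relies on an LLM judge by default, so credentials for the configured provider (for example, OPENAI_API_KEY with the OpenAI defaults) usually need to be set beforehand; the printed result reports a context_precision score between 0.0 and 1.0.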

Use Cases

  • Retrieval Efficiency: Optimize retrieval systems to avoid wasting computational resources on irrelevant contexts
  • Vector Search Tuning: Fine-tune vector search parameters (like similarity thresholds) to improve precision (see the sketch after this list)
  • Query Parsing Improvement: Refine query parsing methods to better capture user intent
  • Distraction Reduction: Prevent models from being distracted by irrelevant information that could lead to hallucinations
  • Document Preprocessing: Evaluate different document chunking and preprocessing approaches
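To illustrate the Vector Search Tuning use case, the sketch below sweeps a minimum-similarity cutoff over a small set of candidate contexts and recomputes precision at each setting. The similarity scores and relevance labels are invented for the example, not real retrieval output.

candidate_contexts = [
    ("Electric vehicles produce zero direct emissions, which improves air quality.", 0.82, True),
    ("EVs have lower operating costs compared to conventional vehicles.", 0.74, True),
    ("The history of automobiles dates back to the late 19th century.", 0.41, False),
]

for cutoff in (0.3, 0.5, 0.7):
    retrieved = [(text, relevant) for text, similarity, relevant in candidate_contexts if similarity >= cutoff]
    if not retrieved:
        print(f"cutoff={cutoff:.1f}: nothing retrieved")
        continue
    precision = sum(relevant for _, relevant in retrieved) / len(retrieved)
    print(f"cutoff={cutoff:.1f}: {len(retrieved)} contexts retrieved, precision={precision:.2f}")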

Best Practices

  • Balance Context Precision with Context Recall - improving precision may reduce recall
  • Consider implementing a pre-filtering step to remove obviously irrelevant documents before more expensive processing (see the sketch after this list)
  • Track precision metrics over time to detect drift in retrieval effectiveness
  • Use domain-specific knowledge to define context relevance for specialized applications
  • Combine with query-specific filters to improve precision for particular types of questions
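As a minimal sketch of the pre-filtering practice mentioned above, the snippet below drops documents with almost no lexical overlap with the query before any more expensive evaluation runs; the tokenizer and the 0.2 cutoff are illustrative choices, not a prescribed method.

import re

def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def keyword_overlap(query: str, document: str) -> float:
    query_terms, doc_terms = tokenize(query), tokenize(document)
    return len(query_terms & doc_terms) / max(len(query_terms), 1)

query = "What are the benefits of electric vehicles?"
documents = [
    "Electric vehicles produce zero direct emissions, which improves air quality.",
    "Our cafeteria menu for next week includes pasta and salad.",
]

# Keep only documents with some lexical overlap with the query before running
# the more expensive LLM-based precision evaluation.
filtered = [doc for doc in documents if keyword_overlap(query, doc) >= 0.2]
print(filtered)  # the cafeteria document is dropped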