Documentation

SQL Query Equivalence Evaluator

The SQL Query Equivalence evaluator assesses whether two SQL queries are semantically equivalent despite potential syntactic differences. It determines if a generated SQL query would produce the same results as a reference query when executed against the same database.

SQL Query Equivalence Component

SQL Query Equivalence component interface and configuration

Evaluation Notice: SQL queries may be syntactically different but semantically equivalent. This evaluator looks beyond surface-level syntax to determine if two queries would produce the same result set when executed.

Component Inputs

  • Generated SQL: The SQL query generated by the RAG system to be evaluated

    Example: "SELECT name, salary FROM employees WHERE department = 'Sales' AND salary > 50000;"

  • Reference SQL: The ground truth or expected SQL query for comparison

    Example: "SELECT name, salary FROM employees WHERE department = 'Sales' AND salary > 50000 ORDER BY salary DESC;"

  • Table Schema / Context: Database schema information to help evaluate query validity and equivalence

    Example: { "employees": ["id", "name", "department", "salary", "hire_date"] }

Component Outputs

  • Evaluation Result: Qualitative explanation of the equivalence assessment, highlighting differences between the queries

    Example: "The generated query is semantically equivalent to the reference query in terms of data selection, but differs in ordering. Both queries select the same rows, but the reference query sorts results by salary in descending order."

Score Interpretation

High Equivalence (0.7-1.0)

Queries are semantically equivalent and would produce the same results despite potential syntactic differences

Example Score: 0.95 This indicates that the queries would return essentially the same data with possibly minor differences like column order or sorting

Partial Equivalence (0.3-0.7)

Queries have substantial overlap but differ in some aspects like sorting, column selection, or minor filtering

Example Score: 0.50 This indicates partial equivalence where the queries retrieve similar but not identical result sets

Low Equivalence (0.0-0.3)

Significant semantic differences between the queries that would produce different results

Example Score: 0.15 This indicates that the queries would return substantially different results

Implementation Example

from ragas.metrics import SQLQueryEquivalence # Create the metric sql_equivalence = SQLQueryEquivalence() # Use in evaluation from datasets import Dataset from ragas import evaluate eval_dataset = Dataset.from_dict({ "question": ["Show me all employees in the Sales department with a salary above 50000"], "contexts": [["The employees table has columns: id, name, department, salary, hire_date."]], "answer": ["SELECT name, salary FROM employees WHERE department = 'Sales' AND salary > 50000;"], "reference_answer": ["SELECT name, salary FROM employees WHERE department = 'Sales' AND salary > 50000 ORDER BY salary DESC;"] }) result = evaluate( eval_dataset, metrics=[sql_equivalence] ) print(result)

Use Cases

  • Text-to-SQL Evaluation: Assess natural language interfaces that translate user queries to SQL statements
  • SQL Rewriting: Evaluate systems that optimize or rewrite SQL queries while preserving semantics
  • Database Tools: Test business intelligence tools that generate SQL from user interactions or natural language
  • SQL Learning Systems: Assess educational platforms teaching SQL by evaluating student solutions against reference answers
  • Query Generation: Evaluate LLM capabilities to generate SQL queries based on natural language descriptions

Best Practices

  • Provide comprehensive schema information to enable accurate query analysis
  • Treat ordering and formatting differences differently than semantic differences
  • Consider database-specific SQL dialects and syntax variations
  • Review low equivalence scores manually to identify specific discrepancies
  • Use in combination with query execution tests when possible to verify actual result equivalence