Introduction
Correlated subqueries are a powerful feature in SQL that allow you to perform more complex queries by referencing columns from the outer query within the subquery. Unlike regular subqueries, which are executed once, correlated subqueries are executed once for each row processed by the outer query.
Key Concepts
- Subquery: A query nested inside another query.
- Correlated Subquery: A subquery that references columns from the outer query.
- Outer Query: The main query that contains the subquery.
Syntax
The basic syntax for a correlated subquery is as follows:
SELECT column1, column2, ... FROM table1 AS t1 WHERE columnX = (SELECT aggregate_function(columnY) FROM table2 AS t2 WHERE t1.columnZ = t2.columnZ);
Practical Example
Let's consider two tables: employees
and departments
.
Table: employees
employee_id | name | department_id | salary |
---|---|---|---|
1 | Alice | 1 | 50000 |
2 | Bob | 2 | 60000 |
3 | Charlie | 1 | 55000 |
4 | David | 3 | 70000 |
5 | Eve | 2 | 65000 |
Table: departments
department_id | department_name |
---|---|
1 | HR |
2 | IT |
3 | Finance |
Example Query
Find the employees whose salary is greater than the average salary in their department.
SELECT name, salary FROM employees AS e1 WHERE salary > (SELECT AVG(salary) FROM employees AS e2 WHERE e1.department_id = e2.department_id);
Explanation
-
Outer Query:
SELECT name, salary FROM employees AS e1 WHERE ...
- This query selects the
name
andsalary
of employees from theemployees
table.
- This query selects the
-
Correlated Subquery:
(SELECT AVG(salary) FROM employees AS e2 WHERE e1.department_id = e2.department_id)
- This subquery calculates the average salary for the department of the current employee (
e1
). - The subquery is executed for each row in the outer query, using the
department_id
of the current row (e1.department_id
).
- This subquery calculates the average salary for the department of the current employee (
Exercises
Exercise 1
Problem: Find the employees who earn more than the average salary in their department.
Solution:
SELECT name, salary FROM employees AS e1 WHERE salary > (SELECT AVG(salary) FROM employees AS e2 WHERE e1.department_id = e2.department_id);
Exercise 2
Problem: List the departments where the average salary is greater than $60,000.
Solution:
SELECT department_name FROM departments AS d WHERE (SELECT AVG(salary) FROM employees AS e WHERE e.department_id = d.department_id) > 60000;
Exercise 3
Problem: Find the employees who have the highest salary in their department.
Solution:
SELECT name, salary FROM employees AS e1 WHERE salary = (SELECT MAX(salary) FROM employees AS e2 WHERE e1.department_id = e2.department_id);
Common Mistakes
- Incorrect Column References: Ensure that the columns referenced in the subquery are correctly correlated with the outer query.
- Performance Issues: Correlated subqueries can be slow because they are executed once for each row in the outer query. Consider using JOINs or other optimization techniques if performance is an issue.
Conclusion
Correlated subqueries are a powerful tool for performing complex queries that depend on the data from the outer query. By understanding how to use them effectively, you can solve a wide range of data retrieval problems in SQL. Practice with the provided exercises to reinforce your understanding and become proficient in using correlated subqueries.
SQL Course
Module 1: Introduction to SQL
Module 2: Basic SQL Queries
Module 3: Working with Multiple Tables
Module 4: Advanced Data Filtering
- Using LIKE for Pattern Matching
- IN and BETWEEN Operators
- NULL Values and IS NULL
- Aggregating Data with GROUP BY
- HAVING Clause
Module 5: Data Manipulation
Module 6: Advanced SQL Functions
Module 7: Subqueries and Nested Queries
- Introduction to Subqueries
- Correlated Subqueries
- EXISTS and NOT EXISTS
- Using Subqueries in SELECT, FROM, and WHERE Clauses
Module 8: Indexes and Performance Optimization
- Understanding Indexes
- Creating and Managing Indexes
- Query Optimization Techniques
- Analyzing Query Performance