In this section, we will cover how to filter and sort data using SQL in BigQuery. These are fundamental skills that will help you retrieve specific subsets of data and organize them in a meaningful way.

Filtering Data

Filtering data allows you to retrieve only the rows that meet certain conditions. This is done using the WHERE clause in SQL.

Basic Filtering

The WHERE clause is used to filter records based on specified conditions. Here’s a simple example:

SELECT *
FROM `my_dataset.my_table`
WHERE age > 30;

In this example:

  • SELECT * retrieves all columns.
  • FROM my_dataset.my_table specifies the table.
  • WHERE age > 30 filters the rows to include only those where the age column is greater than 30.

Common Operators

Here are some common operators you can use in the WHERE clause:

Operator Description Example
= Equal to age = 30
<> Not equal to age <> 30
> Greater than age > 30
< Less than age < 30
>= Greater than or equal to age >= 30
<= Less than or equal to age <= 30
BETWEEN Between a range age BETWEEN 20 AND 30
LIKE Pattern matching name LIKE 'J%'
IN Matches any in a list age IN (20, 30, 40)

Combining Conditions

You can combine multiple conditions using AND and OR:

SELECT *
FROM `my_dataset.my_table`
WHERE age > 30 AND city = 'New York';

In this example, only rows where age is greater than 30 and city is 'New York' will be retrieved.

Sorting Data

Sorting data allows you to order the results based on one or more columns. This is done using the ORDER BY clause.

Basic Sorting

The ORDER BY clause is used to sort the result set by one or more columns:

SELECT *
FROM `my_dataset.my_table`
ORDER BY age;

In this example:

  • ORDER BY age sorts the rows by the age column in ascending order.

Sorting in Descending Order

To sort the results in descending order, use the DESC keyword:

SELECT *
FROM `my_dataset.my_table`
ORDER BY age DESC;

Sorting by Multiple Columns

You can sort by multiple columns by separating them with commas:

SELECT *
FROM `my_dataset.my_table`
ORDER BY age DESC, name ASC;

In this example:

  • The rows are first sorted by age in descending order.
  • If there are rows with the same age, they are then sorted by name in ascending order.

Practical Examples

Example 1: Filtering and Sorting

Retrieve all employees older than 25 and sort them by their last name:

SELECT *
FROM `company.employees`
WHERE age > 25
ORDER BY last_name;

Example 2: Combining Conditions

Retrieve all products that are either in the 'Electronics' category or have a price greater than $100, and sort them by price in descending order:

SELECT *
FROM `store.products`
WHERE category = 'Electronics' OR price > 100
ORDER BY price DESC;

Exercises

Exercise 1: Basic Filtering

Retrieve all rows from the sales table where the amount is greater than 500.

SELECT *
FROM `sales`
WHERE amount > 500;

Exercise 2: Combining Conditions

Retrieve all rows from the customers table where the city is 'San Francisco' and the age is less than 40.

SELECT *
FROM `customers`
WHERE city = 'San Francisco' AND age < 40;

Exercise 3: Sorting

Retrieve all rows from the orders table and sort them by order_date in ascending order.

SELECT *
FROM `orders`
ORDER BY order_date ASC;

Exercise 4: Filtering and Sorting

Retrieve all rows from the employees table where the department is 'HR' and sort them by hire_date in descending order.

SELECT *
FROM `employees`
WHERE department = 'HR'
ORDER BY hire_date DESC;

Summary

In this section, you learned how to filter and sort data using SQL in BigQuery. You now know how to:

  • Use the WHERE clause to filter data based on conditions.
  • Combine multiple conditions using AND and OR.
  • Use the ORDER BY clause to sort data in ascending or descending order.
  • Sort data by multiple columns.

These skills are essential for querying and analyzing data effectively. In the next section, we will cover how to aggregate data to gain deeper insights.

© Copyright 2024. All rights reserved