Normalization is a database design technique that organizes tables in a manner that reduces redundancy and dependency of data. It divides large tables into smaller ones and links them using relationships. The main goal of normalization is to eliminate redundant data and ensure data dependencies make sense to improve the integrity and efficiency of the database.
Key Concepts of Normalization
- Redundancy: The unnecessary repetition of data within a database.
- Dependency: The relationship between columns in a table.
- Normal Forms: The stages of normalization, each with specific rules to follow.
Normal Forms
Normalization is typically divided into several normal forms (NF). Each normal form has specific requirements that must be met:
First Normal Form (1NF)
- Rule: Each table cell should contain a single value, and each record needs to be unique.
- Objective: Eliminate repeating groups in individual tables.
Example:
-- Unnormalized table CREATE TABLE Orders ( OrderID INT, CustomerName VARCHAR(100), Product1 VARCHAR(100), Product2 VARCHAR(100), Product3 VARCHAR(100) ); -- Normalized table in 1NF CREATE TABLE Orders ( OrderID INT, CustomerName VARCHAR(100) ); CREATE TABLE OrderDetails ( OrderDetailID INT, OrderID INT, Product VARCHAR(100) );
Second Normal Form (2NF)
- Rule: Meet all the requirements of the first normal form and remove subsets of data that apply to multiple rows of a table and place them in separate tables. Create relationships between these new tables and their predecessors through foreign keys.
- Objective: Eliminate partial dependency (where a non-prime attribute is functionally dependent on part of a candidate key).
Example:
-- Table in 1NF CREATE TABLE OrderDetails ( OrderDetailID INT, OrderID INT, Product VARCHAR(100), ProductPrice DECIMAL ); -- Normalized table in 2NF CREATE TABLE Products ( ProductID INT, ProductName VARCHAR(100), ProductPrice DECIMAL ); CREATE TABLE OrderDetails ( OrderDetailID INT, OrderID INT, ProductID INT );
Third Normal Form (3NF)
- Rule: Meet all the requirements of the second normal form and remove columns that are not dependent upon the primary key.
- Objective: Eliminate transitive dependency (where a non-prime attribute is dependent on another non-prime attribute).
Example:
-- Table in 2NF CREATE TABLE Products ( ProductID INT, ProductName VARCHAR(100), ProductPrice DECIMAL, SupplierName VARCHAR(100), SupplierAddress VARCHAR(255) ); -- Normalized table in 3NF CREATE TABLE Suppliers ( SupplierID INT, SupplierName VARCHAR(100), SupplierAddress VARCHAR(255) ); CREATE TABLE Products ( ProductID INT, ProductName VARCHAR(100), ProductPrice DECIMAL, SupplierID INT );
Boyce-Codd Normal Form (BCNF)
- Rule: Meet all the requirements of the third normal form and for any dependency A → B, A should be a super key.
- Objective: Handle anomalies that 3NF does not address.
Example:
-- Table in 3NF CREATE TABLE Courses ( CourseID INT, CourseName VARCHAR(100), InstructorID INT, InstructorName VARCHAR(100) ); -- Normalized table in BCNF CREATE TABLE Instructors ( InstructorID INT, InstructorName VARCHAR(100) ); CREATE TABLE Courses ( CourseID INT, CourseName VARCHAR(100), InstructorID INT );
Practical Exercises
Exercise 1: Normalize the Following Table to 1NF
Table: Students | StudentID | Name | Courses | |-----------|------------|--------------------| | 1 | John Smith | Math, Science | | 2 | Jane Doe | English, History |
Solution:
-- Normalized table in 1NF CREATE TABLE Students ( StudentID INT, Name VARCHAR(100) ); CREATE TABLE StudentCourses ( StudentCourseID INT, StudentID INT, CourseName VARCHAR(100) );
Exercise 2: Normalize the Following Table to 2NF
Table: Orders | OrderID | CustomerName | ProductName | ProductPrice | |---------|--------------|-------------|--------------| | 1 | Alice | Laptop | 1000 | | 2 | Bob | Phone | 500 |
Solution:
-- Normalized table in 2NF CREATE TABLE Customers ( CustomerID INT, CustomerName VARCHAR(100) ); CREATE TABLE Products ( ProductID INT, ProductName VARCHAR(100), ProductPrice DECIMAL ); CREATE TABLE Orders ( OrderID INT, CustomerID INT, ProductID INT );
Exercise 3: Normalize the Following Table to 3NF
Table: Employees | EmployeeID | EmployeeName | Department | DepartmentLocation | |------------|--------------|------------|--------------------| | 1 | John | HR | New York | | 2 | Jane | IT | San Francisco |
Solution:
-- Normalized table in 3NF CREATE TABLE Departments ( DepartmentID INT, DepartmentName VARCHAR(100), DepartmentLocation VARCHAR(100) ); CREATE TABLE Employees ( EmployeeID INT, EmployeeName VARCHAR(100), DepartmentID INT );
Summary
Normalization is a crucial process in database design that helps in organizing data efficiently, reducing redundancy, and ensuring data integrity. By following the rules of different normal forms, you can create a well-structured database that is easy to maintain and query. Understanding and applying normalization principles will significantly improve the performance and reliability of your PostgreSQL databases.
PostgreSQL Course
Module 1: Introduction to PostgreSQL
Module 2: Basic SQL Operations
Module 3: Advanced SQL Queries
Module 4: Database Design and Normalization
Module 5: Advanced PostgreSQL Features
Module 6: Performance Tuning and Optimization
Module 7: Security and User Management
Module 8: Working with JSON and NoSQL Features
Module 9: Extensions and Advanced Tools
- PostGIS for Geospatial Data
- Full-Text Search
- Foreign Data Wrappers
- PL/pgSQL and Other Procedural Languages