Third Normal Form

Learn third normal form with practical examples, diagrams, and best practices. Covers database-design, normalization, database-normalization development techniques with visual explanations.

Understanding Third Normal Form (3NF) in Database Design

Hero image for Third Normal Form

Explore the principles of Third Normal Form (3NF) to design robust and efficient relational databases, minimizing data redundancy and improving data integrity.

Database normalization is a crucial process in relational database design, aimed at reducing data redundancy and improving data integrity. Among the various normal forms, Third Normal Form (3NF) is widely considered a good balance between normalization and performance for most transactional databases. This article will delve into what 3NF entails, why it's important, and how to achieve it.

What is Third Normal Form (3NF)?

Third Normal Form (3NF) builds upon the principles of First Normal Form (1NF) and Second Normal Form (2NF). A table is in 3NF if it meets the following criteria:

  1. It is in 2NF.
  2. All non-key attributes are non-transitively dependent on the primary key. This means that no non-key attribute is dependent on another non-key attribute. In simpler terms, every non-key attribute must provide a fact about the key, the whole key, and nothing but the key (and not about another non-key attribute).

Why is 3NF Important?

Achieving 3NF offers several significant benefits for database design:

  • Reduced Data Redundancy: By eliminating transitive dependencies, 3NF ensures that data is stored in only one place, reducing the likelihood of inconsistencies.
  • Improved Data Integrity: When data is stored once, updates are simpler and less prone to errors. This maintains the accuracy and reliability of the data.
  • Easier Maintenance: Changes to data only need to be made in one location, simplifying database maintenance and reducing the risk of anomalies.
  • Better Query Performance (Often): While over-normalization can sometimes lead to more complex joins, a well-designed 3NF database often performs better due to smaller table sizes and more efficient data retrieval for specific entities.

Achieving 3NF: A Practical Example

Let's consider a Students table that is currently in 2NF but not 3NF. We'll walk through the process of normalizing it to 3NF.

Initial Table (in 2NF, but not 3NF):

Students Table:

StudentID (PK)StudentNameCourseID (PK)CourseNameInstructorNameInstructorDept
101AliceC101Database Fund.Dr. SmithComputer Science
101AliceM202Calculus IDr. JonesMathematics
102BobC101Database Fund.Dr. SmithComputer Science

In this table, InstructorName and InstructorDept are dependent on CourseName, which is a non-key attribute. This is a transitive dependency: (StudentID, CourseID) -> CourseName -> (InstructorName, InstructorDept).

erDiagram
    "Students (Before 3NF)" {
        int StudentID PK
        varchar StudentName
        int CourseID PK
        varchar CourseName
        varchar InstructorName
        varchar InstructorDept
    }
    "Students (Before 3NF)" ||--o{ "CourseName" : "depends on"
    "CourseName" ||--o{ "InstructorName, InstructorDept" : "determines"

ER Diagram of the 'Students' table before 3NF, showing transitive dependency.

To achieve 3NF, we need to remove the transitive dependency by creating a new table for the Course information.

Step 1: Identify the Transitive Dependency

InstructorName and InstructorDept are dependent on CourseName, not directly on the composite primary key (StudentID, CourseID).

Step 2: Create a New Table for the Dependent Attributes

Extract CourseName, InstructorName, and InstructorDept into a new Courses table, with CourseID as its primary key. The CourseID will also remain in the Students table as a foreign key.

Normalized Tables (in 3NF):

Enrollments Table:

StudentID (PK, FK)CourseID (PK, FK)StudentName
101C101Alice
101M202Alice
102C101Bob

Courses Table:

CourseID (PK)CourseNameInstructorNameInstructorDept
C101Database Fund.Dr. SmithComputer Science
M202Calculus IDr. JonesMathematics

Now, both tables are in 3NF. In the Enrollments table, StudentName is directly dependent on the composite primary key (StudentID, CourseID). In the Courses table, CourseName, InstructorName, and InstructorDept are all directly dependent on the primary key CourseID.

erDiagram
    "Enrollments" {
        int StudentID PK, FK
        int CourseID PK, FK
        varchar StudentName
    }
    "Courses" {
        int CourseID PK
        varchar CourseName
        varchar InstructorName
        varchar InstructorDept
    }
    "Enrollments" }|--|| "Students" : "enrolls"
    "Enrollments" }|--|| "Courses" : "takes"

ER Diagram of the normalized tables in 3NF, showing direct dependencies.

Here's how you might create these tables in SQL:

CREATE TABLE Courses (
    CourseID VARCHAR(10) PRIMARY KEY,
    CourseName VARCHAR(100) NOT NULL,
    InstructorName VARCHAR(100),
    InstructorDept VARCHAR(100)
);

CREATE TABLE Enrollments (
    StudentID INT NOT NULL,
    CourseID VARCHAR(10) NOT NULL,
    StudentName VARCHAR(100),
    PRIMARY KEY (StudentID, CourseID),
    FOREIGN KEY (CourseID) REFERENCES Courses(CourseID)
);

SQL DDL for creating the 3NF normalized tables.