📘 Chapter 4 :- Normalization


🧠 What is Normalization in DBMS?




Normalization is a process used in Database Management Systems (DBMS) to:

  • 🔹 Organize data efficiently
  • 🔹 Eliminate data redundancy (repeating or duplicate data)
  • 🔹 Ensure data integrity by structuring the database properly

🎯 Purpose of Normalization

The main goal is to divide a large table into smaller related tables and link them using relationships (foreign keys) so that:

  • Data is stored only once (no duplication)
  • Dependencies are logical and accurate
  • Data anomalies (problems) are reduced

📦 Example Before Normalization

StudentID Name Course1 Course2
101 Alice Math English
102 Bob Science NULL

Problems:

  • Redundant columns (Course1, Course2, etc.)
  • Adding more courses = more columns
  • Hard to manage/update

✅ After Normalization

Student Table:

StudentID Name
101 Alice
102 Bob

Course Table:

StudentID Course
101 Math
101 English
102 Science

Now, the data is atomic, clean, and flexible. You can add any number of courses without changing the table structure!

📌 Benefits of Normalization

  • ✅ Removes duplicate data
  • ✅ Reduces update/delete anomalies
  • ✅ Improves data consistency
  • ✅ Makes database easier to maintain
  • ✅ Efficient storage and better performance

🔢 Steps of Normalization (Normal Forms)

🔹 1️⃣ First Normal Form (1NF)

✅ Rule:
Each column should contain atomic (indivisible) values. No repeating groups or arrays.

❌ Example (Unnormalized Table):

StudentID Name Subjects
101 Alice Math, Science

✅ In 1NF (Subjects separated into rows):

StudentID Name Subject
101 Alice Math
101 Alice Science

🔹 2️⃣ Second Normal Form (2NF)

✅ Rule:
Must be in 1NF. All non-key attributes must depend entirely on the primary key (no partial dependency).

❌ Example (Partial Dependency):

StudentID Subject StudentName SubjectTeacher
101 Math Alice Mr. John

✅ Fix (Split into two tables):

Student Table:
StudentID StudentName
101 Alice
Subject Table:
Subject SubjectTeacher
Math Mr. John

🔹 3️⃣ Third Normal Form (3NF)

✅ Rule:
Must be in 2NF. No transitive dependency (non-key attributes should not depend on other non-key attributes).

❌ Example (Transitive Dependency):

StudentID Name Department HOD
101 Alice Science Dr. Smith

🔴 HOD depends on Department, not directly on StudentID.

✅ After Applying 3NF:

Student Table:
StudentID Name Department
101 Alice Science
Department Table:
Department HOD
Science Dr. Smith

✔ Transitive dependency removed.

🔹 4️⃣ Boyce-Codd Normal Form (BCNF)

✅ Rule:
Must be in 3NF. For every functional dependency A → B, A should be a super key.

🔍 BCNF is a stricter version of 3NF. It fixes cases where 3NF is satisfied but anomalies still exist.

🔹 5️⃣ Fourth Normal Form (4NF)

✅ Rule:
Must be in BCNF. No multi-valued dependencies.

❌ Example (Multi-valued Dependency):

Student Hobby Language
Alice Painting English
Alice Painting French
Alice Dancing English
Alice Dancing French

🔴 Hobby and Language are independent multivalued facts.

✅ After Applying 4NF:

Student_Hobby Table:
Student Hobby
Alice Painting
Alice Dancing
Student_Language Table:
Student Language
Alice English
Alice French

✔ Multi-valued dependency removed.

🔹 6️⃣ Fifth Normal Form (5NF)

✅ Rule:
Must be in 4NF. No join dependencies (lossless join of tables should be possible).

📚 5NF is rarely needed, but useful in highly complex data models involving many-to-many-to-many relationships.

📚 Summary Table of Normal Forms

Normal Form ✅ Rule 🧹 Removes
1NF Atomic values, no repeating groups Repeating groups
2NF Full functional dependency Partial dependency
3NF No transitive dependency Transitive dependency
BCNF Every determinant must be a super key Remaining anomalies in 3NF
4NF No multi-valued dependency Multivalued dependencies
5NF No join dependency Complex join anomalies

Post a Comment

0 Comments