What is Normalization in Database? || 1NF, 2NF, 3NF with examples

What is "Normalization"?

Normalization is the process of minimizing data redundancy and enhancing data integrity in the table. It is a multi-step process that sets the data into tabular form and removes the duplicated data from the relational table.

The goal of normalization in databases is to make the database more flexible by reducing redundancy and dependency.

Need of Normalization

The table is a basic building block in the database design process. So, the structure of the table is the great interest in relational database design. Normalization also helps organize the data and information in the database. So that the following things can be ensured in such a table:

It divides the larger table into smaller tables and links them using the relationship.
It removes data inconsistency
To provides a faster data sorting and indexing.
It simplifies the table structure
It avoids the loss of data and information.
It improves the whole system's performance.
It removes the insertion, deletion, and updating anomalies.
Protects from unauthorized users and secures the data in the database

Advantages of Database Normalization

Here are several advantages of database normalization in various contexts:

In a business setting, normalization can improve the efficiency and accuracy of data entry and analysis.
By organizing data in a more structured way, it is easier to extract meaningful insights and make data-driven decisions.
In a software development context, normalization can help to improve the maintainability and scalability of the database.
By minimizing data dependencies, it is easier to make changes to the database without affecting other parts of the system.
In a data warehousing context, normalization can help to improve the performance of the database by reducing the amount of data that needs to be processed. This can make it easier to extract insights from large and complex datasets.

Overall, normalization can help to improve the structure and organization of a database, which can make it easier to maintain and use over time.

Disadvantages of Database Normalization

There are a few potential disadvantages to database normalization:

Increased complexity: Normalizing a database can make it more complex as the data is broken down into smaller tables and relationships are established between them. This can make it more difficult to understand and use the database, especially for users who are not familiar with the normalized structure.
Performance issues: In some cases, retrieving data from a normalized database can be slower than from a non-normalized database, especially if the database is large and complex. This is because more tables and relationships need to be accessed in order to retrieve the desired data.
Loss of some data: In the process of normalization, some data may be lost or discarded. This can be a disadvantage if the data is important or if it is difficult to recreate.
Increased storage requirements: Normalization can also result in an increase in the amount of storage space required to store the data as the data is spread out over multiple tables.

Overall, while normalization can bring many benefits, it is important to carefully consider the trade-offs involved before deciding to normalize a database.

Keys

A key is an attribute that is used to identify a particular record in a database. A key may be composed of more than one attribute. Any attribute that is part of the key is also known as a key attribute.

Types of keys:

Candidate key: All attribute combinations inside a relation that can be used to uniquely identify a required record are candidate keys. A relationship can have multiple candidate keys.
Primary key: A primary key is a candidate key chosen by a database designer to identify an entity from the entity set. It must be unique.
Super key: A super key is a set of one or more attributes that, taken collectively, allow us to uniquely identify an entity in the entity set.
Alternate key: In the case of two or more candidate keys, only one of them serves as the primary key. An alternate key is the candidate key, which is not used as the primary key.
Foreign key: A foreign key is used to represent the relationship between two tables. A foreign key is a key attribute whose value is derived from the primary key of another table.

Types of Database Normalization

First Normal Form (1NF)

First Normal Form (1NF) is a database design technique that organizes tables in a manner that reduces redundancy and dependency. It is the first step in the process of normalizing a database.

A single cell must not hold more than one value (atomicity).
There should not be any repeating groups of an attribute.
It eliminates the repeating groups of attributes in an entity.

If any characteristics is repeated again and again in the same table or rows, then such attributes are removed either into a separate table or decomposed into different rows.

For example, there is a student table with un-normalized and 1NF values shown below:

Second Normal Form (2NF)

In this normal form, the table should be in 1NF, and each attribute is functionally dependent on the entire primary key. The main purpose of 2NF is to eliminate partial dependencies. In other words, each attribute of the table must depend on the whole key, not just a part of it.

For example, in the table above, the primary key is Roll_No. The dependencies of other attributes on this primary key. So the above table decomposes into the following tables:

Third Normal Form (3NF)

When a table is in 2NF, it eliminates repeating groups and redundancy, but it does not eliminate transitive partial dependency. This means a non-prime attribute (an attribute that is not part of the candidate’s key) is dependent on another non-prime attribute. This is what the third normal form (3NF) eliminates.

So, for a table to be in 3NF, it must:

It should be in 2NF.
There is no transitive partial dependency.

A transitive dependency is the one in which, among 3 attributes A, B and C, If A→B, B→C then C→A. A relation in 3NF should not contain transitive dependency as shown above.

The above table can be presented in 3NF as follows:

GeekMono