Database Normalization
The primary objective of database normalization is to minimize the duplication of data within a table by
- seperating the unrelated data into individual tables
- encouraging the use of relationships to connect different tables
Pros
- Reduces Data Redundancy: By breaking down data into smaller related tables, it helps to conserve storage space and reduces the chances of inconsistencies or anomalies in the data.
- Simplifies Maintenance: Maintenance tasks such as adding, modifying, or deleting data become more straightforward, as changes typically need to be made in fewer places, reducing the likelihood of errors.
- Promotes Scalability: Normalized databases are generally more adaptable to changes and expansions in the data model, better suited for scaling as the application requirements evolve over time.
Cons
- Complexity in Joins: the normalization of databases often require multiple joins to retrieve data from related tables, which can complicate the query and increase the risk of errors, particularly for developers who are not proficient in SQL or database design principles.
- Increased storage load: Due to the increased data flexibility, larger storage space may be required as the data will be stored in multiple tables.
Levels of Normalization
1NF
- Remove individual data tables within duplicate groups.
- Create different data tables for each set of associated data.
- Use the primary index key to identify each associated set of data.
2NF (Remove Partial Dependency)
- All data in the data table must be completely dependent on the keys (primary key and candidate key) of the data table.
- If there are data that are only related to part of a key, they must be separated into another data table.
3NF (Remove Transitive Dependency)
- All data in non-key value fields must only be related to the primary key.
- Non-key value fields cannot be related to each other.