In data modeling and database design, denormalization generally refers to the practise of dimensional modeling while normalization refers to Entity relationship modeling.
Normalization is a way of organizing data so that redundancy is eliminated . This involves deduplication and ensuring that data is atomic. Each table represents information about one single entity making it independent of other entities. Normalization aims to have each table represent a unique definition. Each table has a unique key. All the fields and attributes within the table depend on this key. A table is generally considered normalized if it satisfies the 3NF .
A key benefit of normalized forms is that when changes or additions are being made to data , the change is only required to be made to one entity/table
Denormalization is the deliberate violation of the 1st, 2nd, or third normal forms. This process leads to the introduction of Foreign keys in tables. This occurs when a Fact table is created in dimensional modeling.
A fact table contains measures about business events while the dimensions describe those events. A fact table therefore contains many keys. These are foreign keys that link it to the various dimensions of the business. This facilitates quick and efficient analytics.
Dimensional modeling organizes data into Facts and Dimensions. This organization of data allows business users understand the data better and offers a simple view of the business data.
Normalizing data is considered essential for storing business transactions and operations.
Why then do we normalize data only to denormalize data when modeling for analytics and reporting anyway?
We do so because Normalization is vital for transaction processing. It allows updates to be made in one entity quickly. It removes redundancies and helps with referential integrity. Denormalization improves analytical processing by making reporting efficient and quicker. It reduces the amount of tables we have to join in order to develop the required output.
While denormalization is essential for efficient business intelligence, analytics and reporting , operational data stores should always be normalized to ensure referential integrity.


Leave a comment