Data Cleaning
Since a data warehouse is used for decision making, it is
important that the data in the warehouse be correct. However,
since large volumes of data from multiple sources are
involved, there is a high probability of errors and anomalies
in the data. Therefore, tools that help to detect data
anomalies and correct them can have a high payoff. Some
examples where data cleaning becomes necessary are:
inconsistent field lengths, inconsistent descriptions,
inconsistent value assignments, missing entries, and violations
of integrity constraints. Not surprisingly, optional fields in
data entry forms are significant sources of inconsistent data.
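
As a rough illustration of how several of the anomaly classes listed above might be detected, the following is a minimal Python sketch. It is not taken from any particular cleaning tool. The sample records, the field names (`zip`, `state`, `gender`, `phone`), the allowed code sets, and the function `find_anomalies` are all hypothetical and chosen only for the example. The sketch checks field lengths, value assignments against an agreed code set, missing entries in an optional field, and one simple integrity constraint (uniqueness of the key). Inconsistent free-text descriptions are left out, since they generally need fuzzier matching.

```python
from collections import Counter

# Hypothetical customer records merged from two source systems.
RECORDS = [
    {"id": 1, "zip": "94304", "state": "CA", "gender": "F", "phone": "650-555-0100"},
    {"id": 2, "zip": "9430", "state": "California", "gender": "female", "phone": None},
    {"id": 3, "zip": "10001", "state": "NY", "gender": "M", "phone": ""},
    {"id": 1, "zip": "60601", "state": "IL", "gender": "M", "phone": "312-555-0101"},
]


def find_anomalies(records, key="id", required=("phone",), fixed_len=None, allowed=None):
    """Return a list of (row_index, field, problem) tuples.

    fixed_len maps a field name to its expected length.
    allowed maps a field name to the set of codes accepted for it.
    Both are assumptions supplied by the caller, not discovered from the data.
    """
    fixed_len = fixed_len or {}
    allowed = allowed or {}
    key_counts = Counter(rec[key] for rec in records)
    report = []
    for row, rec in enumerate(records):
        # Inconsistent field lengths (only checked when a value is present).
        for field, expected in fixed_len.items():
            value = rec.get(field)
            if value and len(value) != expected:
                report.append((row, field, "inconsistent field length"))
        # Inconsistent value assignments: value is outside the agreed code set.
        for field, codes in allowed.items():
            value = rec.get(field)
            if value and value not in codes:
                report.append((row, field, "inconsistent value assignment"))
        # Missing entries: None or empty string in a field we want populated.
        for field in required:
            if not rec.get(field):
                report.append((row, field, "missing entry"))
        # Integrity constraint: the key must be unique across all records.
        if key_counts[rec[key]] > 1:
            report.append((row, key, "integrity violation: duplicate key"))
    return report


anomalies = find_anomalies(
    RECORDS,
    fixed_len={"zip": 5},
    allowed={"state": {"CA", "NY", "IL"}, "gender": {"F", "M"}},
)
for row, field, problem in anomalies:
    print(f"row {row}: {field}: {problem}")
```

The sketch only detects problems and reports them. Correcting them (for example, mapping "California" to "CA") would require domain-specific rules that are outside the scope of this example.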