Data Quality Dimensions
Data quality (DQ) dimensions are a common way to identify and group DQ checks. There are many definitions, and the number of dimensions varies considerably: you might find 16 or even more. From a practical perspective, it is less confusing to start with a few dimensions and establish a shared understanding of them among your users.
- Completeness: Is all the required data available and accessible? Are all the needed sources available and loaded? Was data lost between stages?
- Consistency: Is there erroneous, conflicting, or inconsistent data? For example, a contract in a “Terminated” state must have a valid termination date that falls on or after the contract's start date.
- Uniqueness: Are there any duplicates?
- Integrity: Is all data linked correctly? For example, are there orders linking to nonexistent customer IDs (a classic referential integrity problem)?
- Timeliness: Is the data current? For example, in a data warehouse with daily updates, I would expect yesterday’s data to be available today.
Source: Data Quality Implementation in Data Warehouses | Toptal
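To make the five dimensions more concrete, here is a minimal sketch of one check per dimension in plain Python. It is not taken from the cited article: the sample records, field names (`customer_id`, `order_id`, `termination_date`, and so on), and function names are all assumptions chosen for illustration, and a real warehouse would more likely express these checks in SQL or a DQ tool.

```python
from datetime import date, timedelta

# Hypothetical sample data; every field name here is an assumption.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # links to a nonexistent customer
    {"order_id": 11, "customer_id": 2},  # duplicate order_id
]
contracts = [
    {"contract_id": "A", "state": "Terminated",
     "start_date": date(2023, 1, 1), "termination_date": date(2023, 6, 30)},
    {"contract_id": "B", "state": "Terminated",
     "start_date": date(2023, 5, 1), "termination_date": date(2023, 4, 1)},
    {"contract_id": "C", "state": "Terminated",
     "start_date": date(2023, 5, 1), "termination_date": None},
]


def check_completeness(source_count, loaded_count):
    """Completeness: rows lost between two stages (0 means none were lost)."""
    return source_count - loaded_count


def check_consistency(contracts):
    """Consistency: terminated contracts whose termination date is missing
    or earlier than the start date."""
    return [
        c["contract_id"]
        for c in contracts
        if c["state"] == "Terminated"
        and (c["termination_date"] is None
             or c["termination_date"] < c["start_date"])
    ]


def check_uniqueness(rows, key):
    """Uniqueness: key values that occur more than once."""
    seen, duplicates = set(), set()
    for row in rows:
        value = row[key]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)


def check_integrity(orders, customers):
    """Integrity: orders whose customer_id has no matching customer."""
    known_ids = {c["customer_id"] for c in customers}
    return [o["order_id"] for o in orders if o["customer_id"] not in known_ids]


def check_timeliness(latest_load_date, today):
    """Timeliness: True if data from yesterday (or newer) is available today."""
    return latest_load_date >= today - timedelta(days=1)
```

Each function returns the offending records (or a count or flag) rather than raising an error, so the results can be logged or aggregated per dimension. For instance, `check_integrity(orders, customers)` reports the order that points at the missing customer, and `check_consistency(contracts)` flags contracts B and C.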