Data Quality: Fact and Perception
Strategic Insight
The fundamental rule for a data warehouse environment is that it contains the single version of truth for the enterprise. But what does this mean? It comes down to data quality. When the quality and integrity of information obtained from the data warehouse are questioned or believed to be wrong, the data warehouse is not an effective business tool. There are two aspects to the issue of data quality: fact (the data is right or wrong) and perception (users of the data do or do not find it useful for their needs). I find that issues of fact are understood (even though ensuring data quality is an ongoing challenge), but that issues based on users' perceptions are not.
First, let's consider fact. The essential principle for data contained in a data warehouse is that it is the same as the representation of that fact contained in its operational source system of record; this is what "truth" means for business intelligence. Every data element contained in the data warehouse is sourced and extracted from the operational system that holds the best, most correct version of that data element in the company.
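One way to picture this definition of "truth" is a reconciliation check that compares warehouse rows with their source system of record. The following is only a minimal sketch: the `reconcile` function, the customer keys and the field names are hypothetical and chosen for illustration, and a real check would typically run against database tables rather than in-memory dictionaries.

```python
from typing import Any

Row = dict[str, Any]


def reconcile(source: dict[str, Row], warehouse: dict[str, Row]) -> list[tuple]:
    """Return (key, field, source_value, warehouse_value) for every mismatch.

    The source system of record is treated as the reference: a warehouse
    value is "true" only if it equals the value held in the source.
    """
    mismatches = []
    for key, src_row in source.items():
        wh_row = warehouse.get(key)
        if wh_row is None:
            # The row exists in the source but never reached the warehouse.
            mismatches.append((key, "<row>", "present", "missing"))
            continue
        for field, src_value in src_row.items():
            wh_value = wh_row.get(field)
            if wh_value != src_value:
                mismatches.append((key, field, src_value, wh_value))
    return mismatches


# Hypothetical sample data.
source_of_record = {
    "C001": {"name": "Acme Corp", "state": "NY"},
    "C002": {"name": "Globex", "state": "CA"},
    "C003": {"name": "Initech", "state": "TX"},
}
warehouse_rows = {
    "C001": {"name": "Acme Corp", "state": "NY"},
    "C002": {"name": "Globex", "state": "WA"},
}

issues = reconcile(source_of_record, warehouse_rows)
for issue in issues:
    print(issue)
```

With this sample data the check reports the differing `state` value for `C002` and the missing row for `C003`. An empty result would mean every warehouse element equals its source of record, which is the sense of "truth" used here.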
Data quality processes, an essential part of the extract, transform and load (ETL) process, ensure that data elements brought into the data warehouse are "true." Data elements are scrubbed so that they conform to corporate requirements for data format, usability and correctness. Data quality practices typically include:
- Creating a single definition for each data element critical to decision making and driving sales, service, operational and financial performance.
- Developing logical and physical models showing data relationships and organization.
- Defining key metrics, especially key performance indicators, that are used enterprise-wide.
- Integrating data from multiple sources and resolving differences to match the single accepted definition for data artifacts.
- Validating data extracted from operational source systems to determine the accuracy of information represented.
- Correcting inconsistent data.
- Creating corporate standards for data content.
- Treating data from external data providers as if it were your own, scrubbing it the same way, and judiciously matching it with internal data.
These important actions, while improving the quality of source data brought into the data warehouse, can also contribute to business users' perceptions about data quality.
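Two of the practices listed above, correcting inconsistent data and validating it against corporate standards, can be sketched as a small scrub-then-validate step. This is an illustration under stated assumptions, not a definitive implementation: the field names, the `VALID_STATES` set, the `C` plus three digits identifier format and the ten-digit phone rule are all hypothetical stand-ins for whatever standards a company actually defines.

```python
import re

# Hypothetical corporate standard: the only state codes accepted.
VALID_STATES = {"NY", "CA", "TX"}


def scrub(record: dict[str, str]) -> dict[str, str]:
    """Conform a raw record to the (assumed) corporate format standards."""
    return {
        "customer_id": record.get("customer_id", "").strip().upper(),
        "state": record.get("state", "").strip().upper(),
        # Keep digits only, dropping spaces, parentheses and hyphens.
        "phone": re.sub(r"\D", "", record.get("phone", "")),
    }


def validate(record: dict[str, str]) -> list[str]:
    """Return the names of fields that still violate the standards."""
    errors = []
    if not re.fullmatch(r"C\d{3}", record["customer_id"]):
        errors.append("customer_id")
    if record["state"] not in VALID_STATES:
        errors.append("state")
    if len(record["phone"]) != 10:
        errors.append("phone")
    return errors


raw = {"customer_id": " c001 ", "state": "ny", "phone": "(212) 555-0100"}
clean = scrub(raw)
print(clean, validate(clean))

bad = scrub({"customer_id": "X1", "state": "zz", "phone": "555"})
print(bad, validate(bad))
```

The first record passes once it has been scrubbed, while the second still fails all three rules. Records that fail validation would be sent back for correction rather than loaded into the warehouse.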
While fact is quantitative (the data element is "true" or "false"), perceptions are qualitative and are based on requirements and use. This aspect of data quality is often overlooked: Does this data serve business users' needs? Is the quality of the users' experience with the data appropriate? It is important to address business users' qualitative expectations of the data warehouse.
These expectations typically include:
- Data stability: While data in the business environment can be volatile, there are certain activities, such as simulation and predictive analysis, that require data stability while algorithms and models are being developed. Not having the required data stability will lead to user dissatisfaction.
- Data timeliness: Some users require the most up-to-date information, while others are satisfied with data a week or even a month old. Meeting these diverse needs is as important as meeting those for data stability.
- Data delivery: Equally important is how data is delivered. Some users prefer delivery through e-mail, some want PDF documents and others prefer their data delivered using a particular analytic tool.
- Data manipulation: Many business users want the ability to manipulate data to create new metrics, find new data relationships and create their own analyses and reporting.
- Data familiarity: Users expect data to match the data used in their operational source systems. If their data does not match that contained in the source system of record, it will appear unfamiliar and be perceived as wrong. This data inconsistency exists in a surprising number of operational source systems.
Perception is driven by the needs, expectations and uses of data. These are quality factors as important as the factors for data fact.
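Some of these expectations can be made measurable. As one hedged example, the timeliness expectation can be expressed as a maximum acceptable data age per user group and checked against the time of the last warehouse load. The group names and age limits in this sketch are hypothetical; actual limits would come from the users themselves.

```python
from datetime import datetime, timedelta

# Hypothetical tolerances: how old the data may be for each user group.
MAX_AGE = {
    "operations": timedelta(days=1),
    "finance": timedelta(days=7),
    "planning": timedelta(days=30),
}


def groups_with_stale_data(last_load: datetime, now: datetime) -> list[str]:
    """Return the user groups whose timeliness expectation is not met."""
    age = now - last_load
    return [group for group, limit in MAX_AGE.items() if age > limit]


# Example: the warehouse was last loaded nine days before the check.
last_load = datetime(2024, 1, 1)
now = datetime(2024, 1, 10)
stale = groups_with_stale_data(last_load, now)
print(stale)
```

Here the same nine-day-old load satisfies the planning group but not operations or finance, which is the point of treating timeliness as a perception issue: the data has not changed, only the need it is measured against.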
Who is responsible for data quality? Many organizations assign data ownership. This mechanism works well for operational source systems because the business community that uses one has a vested interest in keeping its data correct. This mechanism does not work for enterprise-wide data warehouses.
Who owns the data in the corporate data warehouse? If a data element is incorrect but it matches the data value in its source system of record, it is the operational source system that needs to be changed according to our definition of "truth." The corrected data element will then flow into the data warehouse. The ownership of the data in the corporate data warehouse, I believe, belongs to the BI competency center.
Data warehouse stewardship requires that each data element be equal to its source system of record and that the qualitative needs of users of the data warehouse are met. The BI competency center uses the data and BI to help the business accomplish its strategic objectives and is staffed with individuals skilled in the technologies needed to do so. This approach makes data quality perception issues as important as fact issues. The data is owned by the organization that uses it and stewards it for the company: the one with a vested interest in keeping it right.
Richard Skriletz is the national managing principal for business intelligence and data warehousing at RCG Information Technology, a leading national provider of IT professional services. Skriletz has more than 30 years of experience in information technology and consulting services in the computer, insurance, banking, utilities, telecommunications and manufacturing industries. He passionately believes that business intelligence is a critical success factor for business today and that successful business intelligence requires a focus on improving business results and strategy. Skriletz may be reached via e-mail at rskrilet@rcgit.com.