Data Quality: The Price of Entry
Knowledge: The Essence of Metadata
What exactly is data quality? The most obvious answer is that data quality represents the validity of the information. But that doesn't really tell the whole story of data quality.
Dimensions of Data Quality
Take a look at the following six dimensions, as described by the U.S. Government Accountability Office.
- Accuracy. The extent to which the data is free from significant error.
- Validity. The extent to which the data adequately represents actual performance.
- Completeness. The extent to which enough of the required data elements are collected from a sufficient portion of the target population or sample.
- Consistency. The extent to which data is collected using the same procedures and definitions across collectors and times.
- Timeliness. Whether data about recent performance is available when needed to improve program management and report to the business.
- Ease of use. How readily intended users can access data, aided by clear data definitions, user-friendly software and easily used access procedures.
Remember, the metadata business model serves two sets of customers: the producers of metadata information and the consumers. Who owns responsibility for these six dimensions of data quality? I will argue that the metadata services group completely owns the last three (consistency, timeliness and ease of use) and shares influence over the third, completeness. Ultimately, the responsibility for accuracy, validity and completeness belongs to those who produce the metadata information. As the broker of knowledge, we must focus on the other dimensions in order to deliver business value. If everyone focuses on their own roles and responsibilities and performs them well, then the entire effort will be successful.
I believe that poor data quality is a symptom of the problem, not the problem itself. If bad data gets into the repository, then our data edits and business processes are broken. It is the metadata services group's responsibility to ensure that bad data never gets into the system to begin with. It's not that organizations are opposed to managing data quality; it's that quality has to be the price of entry. You can't get information into the repository unless you sign off on the quality. The system shouldn't allow you to bring in logical and physical models, data transformations and database definitions unless they all match. If they don't, you simply return the data to the originators and have them correct it. Don't worry; after a few of these "return to sender" information loads, they will get the idea. Hey, the post office does this, so why can't we?
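As a rough illustration of the "return to sender" rule, here is a minimal Python sketch of a load gate. Everything in it is hypothetical: the `MetadataLoad` structure, its field names and the three consistency checks are assumptions made for the example, not the interface of any particular repository product. The point is only that a load is accepted when the logical model, physical model and transformations agree, and is otherwise handed back with a list of mismatches.

```python
from dataclasses import dataclass


@dataclass
class MetadataLoad:
    """Hypothetical submission; all names are illustrative assumptions."""
    originator: str
    logical_entities: set[str]        # entities in the logical model
    physical_tables: dict[str, str]   # table name -> logical entity it implements
    transformation_targets: set[str]  # tables written by data transformations


def find_mismatches(load: MetadataLoad) -> list[str]:
    """Return a description of every place where the submitted pieces disagree."""
    problems = []
    implemented = set(load.physical_tables.values())
    known_tables = set(load.physical_tables)

    for entity in sorted(load.logical_entities - implemented):
        problems.append(f"logical entity '{entity}' has no physical table")
    for table, entity in sorted(load.physical_tables.items()):
        if entity not in load.logical_entities:
            problems.append(f"table '{table}' maps to unknown entity '{entity}'")
    for table in sorted(load.transformation_targets - known_tables):
        problems.append(f"transformation targets undefined table '{table}'")
    return problems


def submit(load: MetadataLoad, repository: list) -> list[str]:
    """Accept the load only if it is clean; otherwise return it to the sender."""
    problems = find_mismatches(load)
    if not problems:
        repository.append(load)
    return problems  # an empty list means the load was accepted
```

A caller would pass each incoming load to `submit` and, if the returned list is not empty, send that list back to `load.originator` rather than storing anything.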
The key to doing a good job is getting it right the first time, and having a plan in place for when things don't go as planned. I shudder to even say anything about dealing with variances, because a variance shouldn't be the norm. Systems that control data quality should be efficient and responsive in order to protect the customer experience. They must ensure data quality on the front end, where the expense of repair is lowest. The advantage of having systems in place is the elimination of the variation that inevitably comes with data management. The key is deciding what to automate and whom to empower.
Quality is no longer job one; it is assumed and expected. What happens when data quality is no longer special? What happens when the repository accurately reflects the data environment? Maybe poor data quality was acceptable in 1987, but 20 years later, we expect more out of our systems and technology environment. Should we strive for a Six Sigma accuracy rate of 99.99966 percent (3.4 defects per million opportunities)? Yes, and a resounding yes at that. Unfortunately, our technology, processes and architectures are working against us, but that will change over time.
Another thing about focusing on data quality is that it isn't very motivational. Well, maybe during the early stages, but the reality is that you will never reach perfection in data quality. Data ages, systems age and the objects they represent age, which means that any snapshot you take of metadata is inherently wrong. In a catalog of 100,000 assets with an average of 20 metadata fields, there are 2 million field values, so even a 99.99 percent accuracy rating leaves 200 errors. Not to mention, you don't know which metadata elements are in error at any given time without a statistical audit. How are you going to motivate your team when you reach 99.99 percent? "Come on team, we can do it! Another .001 percent is all we need." Data quality is not your P&L (profit and loss); data quality is the price of entry into the business of data management.
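To make the arithmetic concrete: 100,000 assets with 20 fields each is 2,000,000 field values, so a 99.99 percent accuracy rate implies about 200 erroneous values, and a 99.9 percent rate implies about 2,000. The short Python sketch below reproduces that calculation. The function name and its parameters are made up for the example, and it assumes accuracy is measured per field value.

```python
from fractions import Fraction


def expected_errors(assets: int, fields_per_asset: int, accuracy_percent: str) -> int:
    """Number of erroneous field values implied by a per-field accuracy rate.

    accuracy_percent is passed as a string (e.g. "99.99") so the rate is
    handled exactly rather than as a binary floating-point approximation.
    """
    total_fields = assets * fields_per_asset
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return round(total_fields * error_rate)


print(expected_errors(100_000, 20, "99.99"))  # 200
print(expected_errors(100_000, 20, "99.9"))   # 2000
```

Even at these rates the errors do not identify themselves, which is why some form of sampling audit is still needed to find them.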
R. Todd Stephens, Ph.D., is the director of Collaboration and Online Services for AT&T, located in Atlanta, Georgia. He has more than 20 years of experience in IT and speaks around the world on metadata, data architecture and information technology. Stephens recently earned his Ph.D. in information systems and has more than 70 publications in the academic, professional and patent arena. You can reach him via e-mail at Todd@rtodd.com or to learn more visit http://www.rtodd.com/.