IQ and Muda: Information Quality Eliminates Waste

  Article published in DM Review Magazine
September 2005 Issue
  By Larry English

  • One bank had to scrap a $29 million data warehouse project because their original design was faulty. They had not designed the data acquisition process to capture data at the original sources, nor had they addressed the quality of information to support the data warehouse knowledge-workers.
  • Another bank lost more than $200 million in default loans when they approved credit based on faulty credit scores.
  • Yet another bank lost $600 million as the result of misinterpretation of a risk code when the company they invested in failed.

These examples of waste represent one of the primary objectives of information quality - to eliminate waste. The other key objective of IQ, as with any valid quality system, is to increase customer satisfaction.


Poor quality information causes processes to fail, creating cost of recovery, cost of fixing defective data and customer alienation that costs customer lifetime value.

An important way of looking at information quality management is from the perspective of results or outcomes. Information quality management eliminates waste. Poor quality information causes waste - waste of people's time, money, materials, facilities, equipment and, importantly, customers' time and money, which drives customers away, costing customer lifetime value.

The Japanese word for waste is muda. In Japan, muda refers to any activity that is not value-adding. Taiichi Ohno, who introduced the just-in-time (JIT) production system at Toyota Motor Company, was the first person to recognize the enormous amount of muda or waste in the everyday work activities of an organization.1

A business value chain is an end-to-end set of activities that operate on a product, service, document or set of information, making it ready for the next activity. The resources used in each activity either add value or they do not. Resources consumed (such as people's time or equipment) that do not add value add cost and are muda (waste).

There are nine types of muda in information quality:

  • Muda of overproduction
  • Muda of inventory
  • Muda of repair/rejects
  • Muda of motion
  • Muda of processing
  • Muda of transport
  • Muda of waiting
  • Muda of process failure caused by defective information
  • Muda of wrong or suboptimized decisions caused by defective information

The first seven types are described by Masaaki Imai in his book Gemba Kaizen.2 We will examine each type of muda, why it is muda and how it wastes the other resources of the enterprise.

Muda of Overproduction

In manufacturing, overproduction creates more products than needed at a point in time. It is often caused by fear of running out. However, producing more than "just-enough" or "just-in-time" inventory drives up the costs of inventory and results in scrap if demand does not equal supply - wasting all the value if the excess products must be scrapped or sold at a loss.

With information, overproduction comes in three forms: redundant systems, duplicate records and "hidden information factories."

Redundant systems. How many application systems are required to capture basic information about a customer? Exactly one. How many customer applications does your organization have that capture basic customer information? Subtract one from that number, and what is left represents muda of overproduction.

The causes of redundant systems can ultimately be traced back to the fact that most organizations are managed vertically. For example, a life insurance product line needs customer information. Thus, when we build the applications, we create application programs and databases to capture life insurance customer information. Then, the auto insurance line creates applications to capture auto insurance customer information and so on. The second through the twentieth or fiftieth times you capture information about the same person are muda. Why? You already know the person once you have captured information about him or her the first time. The second through the fiftieth captures of information are waste.

Duplicate records. Within a single system and database, records about the same real-world object (such as a person, product or location) may be captured multiple times. Again, this is waste. It often stems from pressures that force people to do work fast (productivity!) without allowing them to do things right. For example, call center operators may simply create new records for customers instead of taking the time to check if the customers are already on file.

This muda of overproduction creates new problems. Now, with two or more customer records on file for the same customer, you no longer know the customer or the customer's lifetime value. You waste money communicating with them multiple times. You also encounter reconciliation problems when you try to match people with similar names, which can cause you to consolidate different people into the same record.
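The check that a hurried operator skips can be sketched in a few lines. This is a minimal illustration, not a production matching algorithm: the `CustomerFile` class, the `match_key` function and the choice of name plus postal code as the matching key are all assumptions made for the example. A key this crude can itself merge different people who share a name and postal code, so a real process would need stronger matching rules and human review.

```python
import re


def match_key(name: str, postal_code: str) -> tuple[str, str]:
    """Build a crude matching key: lowercased name without punctuation or
    extra spaces, plus the postal code without spaces (an assumed rule)."""
    clean_name = re.sub(r"[^a-z ]", "", name.lower())
    clean_name = " ".join(clean_name.split())
    return clean_name, postal_code.replace(" ", "").upper()


class CustomerFile:
    """Hypothetical in-memory customer file keyed by match_key."""

    def __init__(self) -> None:
        self._records: dict[tuple[str, str], dict] = {}

    def find_or_create(self, name: str, postal_code: str) -> tuple[dict, bool]:
        """Return (record, created). Look for an existing record first and
        create a new one only when no match is on file."""
        key = match_key(name, postal_code)
        if key in self._records:
            return self._records[key], False
        record = {"name": name, "postal_code": postal_code}
        self._records[key] = record
        return record, True

    def __len__(self) -> int:
        return len(self._records)


# Two spellings of the same (made-up) customer resolve to one record.
customers = CustomerFile()
first, created_first = customers.find_or_create("Mary O'Brien", "37027")
second, created_second = customers.find_or_create("mary  obrien", "37027")
```

The second call returns the record created by the first instead of adding a duplicate, which is the behavior the call center example lacks.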

Hidden information factories. The manufacturing concept of hidden factories describes activities that are outside the normal production processes, often to perform private "scrap and rework" because the normal process does not have the capacity to perform some activity. These hidden factories often require their own (often unaccounted for) inventory.

With today's inexpensive personal computers, many knowledge-workers who cannot do what they need to do with their current systems build their own "hidden information factories" to support their information needs. They may download the data they require, keeping that data accurate and complete to meet their needs.

This muda of overproduction cannot be blamed on the knowledge-workers. They are writing a silent complaint that their information and functional needs are not being met by the production systems, and they have had to take matters into their own hands in order to perform their work. The muda here is significant. All the time they spend to build their own systems and databases is a waste because that time cannot be spent doing their "value" work. One of my clients has counted more than 16,000 hidden information factories around their company.

The costs associated with overproduction of information are significant. Costs of the muda include:

  • Development and maintenance costs of redundant applications.
  • Costs of interfacing information among interdependent systems.
  • Costs of matching, consolidating and purging data from multiple, disparately defined databases.
  • Operating, handling, marketing and processing costs for the multiple redundant or duplicate copies of information.
  • Lost or missed opportunity of not knowing customer relationships with different parts of the business.

Muda of Inventory

It is well known that inventory is a cost item. Finished products, parts and supplies kept in inventory are not value, but pure cost. They add additional cost by requiring space, additional facilities and equipment to handle and move them, and they can become scrap if they exceed shelf life or no demand is made for them.

Just-in-time inventory and lean manufacturing eliminate these unnecessary costs by matching production to demand in real time to eliminate the need for excessive inventory, warehouse and equipment space, etc.

Muda of information inventory is rampant in most organizations, bound up in tens or hundreds of redundant files and databases housing the same or same kind of information.

What is most criminal about this waste of inventory is that electronic information is the only non-consumable resource of the enterprise. If you have a customer record or a product record in a sharable database, all knowledge-workers who need it and have access can get it when they need it. The only valid business reason for a redundant copy is when data is moved to the data warehouse designed to support strategic and tactical processes that are not able to be supported efficiently by data in operational databases.

On average, large organizations have a single fact of data stored ten times or more in operational databases alone, not counting strategic (data warehouse) databases or the hidden information factories. This means that 90 percent of operational data is muda of inventory.
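The 90 percent figure follows from simple arithmetic: if each fact is stored n times and only one copy is needed, the other n - 1 copies are redundant. A minimal sketch, in which the function name is hypothetical:

```python
def redundant_share(copies_per_fact: int) -> float:
    """Fraction of stored copies that are redundant when each fact is
    stored copies_per_fact times and only one copy is needed."""
    if copies_per_fact < 1:
        raise ValueError("a fact must be stored at least once")
    return (copies_per_fact - 1) / copies_per_fact


print(f"{redundant_share(10):.0%}")  # prints 90%
```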

There are, of course, "assumed" arguments for redundancy in operational systems, such as: required for transaction performance, we do not share the same customers, our products are different, it will take too long [to build a shared database and common system], we cannot get support for an enterprise database, etc.

Rarely are these valid, justifiable business reasons for creating the enormous amounts of redundancy that waste valuable information systems professionals' time, money, office, computing and equipment resources.

The precipitating cause of redundant databases and applications is defective data and application development processes. The root cause, however, is managing information as a departmental resource, which is caused by managing the enterprise as an industrial organization that is managed by functions as opposed to value chains.

For quality, you must design data as an enterprise - not departmental - resource, designed around business subjects of information such as people and organizations, products, facilities and equipment, and financials, among others. You must then implement technology to support the operational demand without compromising the data design, including parallel processing, clusters of data records, networked computing, optimizing database performance and designing optimization techniques into the applications, etc.

Muda of Repair/Rejects

Manufacturing has costs of scrap and rework when processes produce defects. Business processes incur costs of "information scrap and rework" when they produce defective information.

Contrary to popular understanding, data cleansing (more appropriately called data correction) activities do not equate to information quality management. Data cleansing is information scrap and rework, a cost of nonquality information. Data correction is the cost of reworking information that was not created correctly at the source or was modified incorrectly, or information for which there was no process to keep the data current.

  • When knowledge-workers have to hunt and chase missing or inaccurate information, they are spending time in muda.
  • When you purchase expensive data cleansing software and use it only to correct defective data, you spend all that money on muda. Why? Information quality happens when you improve processes to prevent defective information and keep it correct.
  • A particularly problematic form of information scrap and rework muda occurs when knowledge-workers cannot trust the data in the production databases or cannot get to it. They may create their own hidden information factories so they can correct their own data and keep it to the degree of quality they need to perform their processes. Whatever you do, do not blame the knowledge-workers for this. Conduct a root-cause analysis to find out why this happens. Then you will find the broken processes that must be improved.

What is especially problematic about this muda is that only those with access to these hidden information factories benefit from the data correction. The best approach is to encourage all knowledge-workers to make any data corrections in the source databases so that all stakeholders can benefit from the data correction.

The goal of information quality management is to eliminate the need to conduct excessive data cleansing activities by improving and controlling the processes that define information; that create, update and move the information (moving it only where necessary, and controlling the data movement); and that present information to knowledge-workers who require it to perform their work.

The only quality way to approach data correction is to perform it as a one-time activity on data in a given database and to couple it with a process improvement initiative to eliminate the causes of defects. If you do not do this, you condemn yourself to performing data cleansing again and again, or you will embed a permanent data sanitation program into your business processes. This creates muda of processing, described later.

Muda of Motion

Muda of motion occurs in manufacturing when the arrangement of machines, materials and people is suboptimized, causing people to have to move farther than necessary to carry out their work, or when assembly lines are unnecessarily long, causing them to break down or jam easily.

Muda of motion in information processes occurs when information producers have to key in more information than is necessary, or when information producers provide information on a form for intermediaries, such as data entry clerks, to key into a database. One of my pet peeves is providing the same information multiple times across different forms at healthcare providers' offices. When I first started seeing my dentist, her office asked patients to complete an information form once a year to keep their information current. I suggested that it would save patient time and office time if she would print out the patient information and have the patient verify and correct or update it as necessary. She now does this, saving three to five minutes per patient per year.

Applications that capture data can minimize data capture time by using several techniques:

  • Use postal data to capture addresses. By capturing only the street address and postal code, the city and state or province or locality (along with other data) can be derived. Make street address and postal code a foreign key and save space in the address database as well.
  • Use check-boxes or pull-down menus for short lists of standard codes to prevent keystroke errors.
  • For long lists, such as standard industry code, create two-tier pull-down menus. The first is a list of general industry types; the second is a list of industry codes within a general industry type.
  • Design online screen hierarchies to minimize the number of screens information producers must navigate to get to screens required to capture or present information.
  • Use a "pull" approach for reports that can be requested on demand and sent to local printers as opposed to in-house mail delivery.
  • Minimize unnecessary intermediation by allowing the actual information producers to capture information electronically. For example, have field workers capture information electronically on laptops or other electronic devices to minimize information float and intermediation errors.
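The first technique in the list, deriving city and state from the postal code, might look like the following sketch. Everything here is an assumption made for illustration: the `POSTAL_DIRECTORY` table holds a couple of sample rows, and the `Address` type and `capture_address` function are hypothetical names. A real application would load reference data from the postal authority.

```python
from dataclasses import dataclass

# Illustrative sample rows only; a real application would load the
# postal authority's reference data instead of hard-coding it.
POSTAL_DIRECTORY = {
    "37027": ("Brentwood", "TN"),
    "37201": ("Nashville", "TN"),
}


@dataclass(frozen=True)
class Address:
    street: str
    postal_code: str
    city: str
    state: str


def capture_address(street: str, postal_code: str) -> Address:
    """Key in only street and postal code; derive city and state."""
    code = postal_code.strip()
    if code not in POSTAL_DIRECTORY:
        # Reject at capture time instead of storing an unverifiable address.
        raise ValueError(f"unknown postal code: {code!r}")
    city, state = POSTAL_DIRECTORY[code]
    return Address(street.strip(), code, city, state)


address = capture_address("123 Example Street", " 37027 ")
```

The information producer keys two fields instead of four, and a mistyped postal code is caught at the point of capture rather than in a later cleansing step.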

When information producers or knowledge-workers are required to take more motion to perform their work, they are not able to perform value-adding work.

Muda of Processing

In manufacturing, muda occurs when a step in the process does not add value as perceived by the customer. After they nearly went bankrupt, one of the most prestigious luxury automakers found they were polishing parts of the automobile and creating features that were irrelevant to their high-income customers. All of this was waste.

This is a category that challenges some of the most common practices in IT. The practices are those that create separate applications performing similar functions for different business or product lines in order to deliver applications quickly. Muda of information processing includes:

  • Multiple applications that capture the same or similar information, when one application would suffice. An insurance company that won a major contract with a large group decided to clone an existing system and database that provided almost identical products and features. Over time, this redundant system saw requirements changes that put the systems out of sync. Furthermore, regulatory requirements changes and other common enhancements required them to perform parallel maintenance on the two applications. All of the duplicate effort involved is waste.
  • Purchasing software packages causes you to pay for some functions you may not require (and the ongoing maintenance for the unneeded functions).
  • Capturing information manually on paper or forms, then giving the forms to intermediaries to re-capture that same information electronically. This transcription function requires capturing information twice, with the potential for errors to be introduced by the intermediation process due to handwriting illegibility, poor form design and synchronization problems between form and screen design. Minimize unnecessary intermediation by allowing the actual information producers to capture information electronically, such as having field workers capture information electronically on laptops or other electronic devices to minimize redundant data capture.
  • Production of reports that are no longer useful. A major railroad carrier eliminated scheduled batch reports, replacing them with menus of standard parameter-driven reports that could be produced easily by knowledge-workers, reducing report processing time by more than 80 percent and saving more than one ton of paper each month.
  • Putting process steps into value chains to perform batch data cleansing or batch job steps to test the validity of data is "muda of processing." Data capture processes should be validated and error-proofed to prevent defects that require such inspection and after-the-fact data correction.
  • Every application program that extracts operational data and transforms it to put in another operational database is muda, whether it is internally developed or a software package. Extracting and transforming electronic data already in your databases does not add value to your customers - it adds costs. Furthermore, it introduces complexity in keeping data consistent, especially when the data is in different formats. Again, the principle is that electronic data in a sharable database is non-consumable.
  • Often, knowledge-workers must create their own work-arounds when they do not have all the information they need to perform their work. Whether manual or electronic, these work-arounds may perform some value-add for the customer; however, the work-around activities to get to the value step are muda. These work-arounds may be steps that should exist in the formal process. If so, they should be standardized, error-proofed and incorporated in the mainline process.

While different business units or product lines may require slightly different information about their customers or products, there is absolutely no reason to have multiple application programs capturing common information such as telephone numbers, addresses, business names and person names.

There is no valid business rationale for having different code values for the same fact, such as marital status or country code. If there were, it would be okay for business managers to create their own budget codes for their departmental budgets and to ask the finance office to integrate their individual budgets into the enterprise budget!

Note: This is what the data warehouse team must do to integrate data into the data warehouse. All of this consolidation activity is muda. It adds no value whatsoever if the common data could be captured and maintained by a single application module. It is pure cost of defective data definition and application development processes.

Consider the further muda caused by business personnel having to learn different code values when going from business area to business area. Or, consider the large insurance company that had overlapping marital status codes across different business areas. In one area, a marital status of "s" identified a "single" person, while in another area, "s" represented a "separated" person!
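One way to picture a single enterprise-wide code set is an enumeration that every application shares. The sketch below is illustrative only: the specific code letters and the `parse_marital_status` helper are assumptions, not values from any real system. The point is that each code has exactly one meaning everywhere.

```python
from enum import Enum


class MaritalStatus(Enum):
    """One enterprise-wide code set; the code values are illustrative."""
    SINGLE = "S"
    MARRIED = "M"
    SEPARATED = "P"
    DIVORCED = "D"
    WIDOWED = "W"


def parse_marital_status(code: str) -> MaritalStatus:
    """Translate a captured code, rejecting anything outside the shared set."""
    try:
        return MaritalStatus(code.strip().upper())
    except ValueError:
        raise ValueError(f"unknown marital status code: {code!r}") from None


status = parse_marital_status("s")
```

With one shared definition, "s" can only ever mean single, and a data warehouse load has nothing to reconcile for this field.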

Muda of Transport

In manufacturing, the movement of physical materials and products is handled by trucks, forklifts, conveyers, etc. While transport is necessary, it adds costs, not value. At the same time, damage can occur during transport.

In information quality, muda of transport occurs with every data movement that extracts data from one database, often transforms it, and loads it into another operational database. This does not add value; it only adds costs. Data movement processes create a new point at which errors can be introduced, and create time lag in the concurrence of data in redundant databases. Concurrence is the time difference in which data is equivalent from one database to another.3

An important division of a Fortune 500 company analyzed the time sheets of its 250 application developers and found that from 48 to 63 percent of their time was spent maintaining interface programs that moved data from one database to another! This company was spending roughly half or more of its application personnel cost on planned data movement muda.
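To put those percentages in headcount terms, the share of time can be converted into full-time equivalents. This is plain arithmetic on the figures quoted above; the function name is hypothetical.

```python
def full_time_equivalents(headcount: int, percent_of_time: int) -> float:
    """Convert a share of total staff time into full-time equivalents."""
    return headcount * percent_of_time / 100


low = full_time_equivalents(250, 48)
high = full_time_equivalents(250, 63)
print(f"{low:g} to {high:g} of 250 developers")  # prints 120 to 157.5 of 250 developers
```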

A telco found they were spending $100 million each year to fix problems where data failed to get moved properly from one system to another and to fix the interface programs.

Because electronic information is the only non-consumable resource, there is absolutely no business justification for redundant databases and data movement among operational applications, except where data cannot be captured at the point of origin in sharable databases.

The more I have studied information value chains that trace the processes that create, update, move, transform and deliver information to knowledge-workers, the more I can only refer to them as "information value/cost chains," for more money is spent in valueless data movement and transformation than in actual value-adding work. In one insurance company information value/cost chain, a single data element changed names seven times and the valid value set changed three times!

Muda of Waiting

Muda of waiting is the waste of people's time when they have to wait for a part, materials or information they require to perform their work or make decisions.

The data movement in the muda of transport creates an additional problem called information float - the delay of time from when information is known to someone in the enterprise to when that information is able to be accessed by other stakeholders.

Information float forces people to wait before they can perform their value work in a value chain. They may or may not be able to occupy their time with other value-adding work in the meantime. We have seen information float time from seconds (in near real-time data propagation) to days or weeks - even months in extreme cases.
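Information float can be measured directly if both moments are timestamped. The sketch below assumes you can record when a fact was first captured and when it became available downstream; the function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def information_float(known_at: datetime, accessible_at: datetime) -> timedelta:
    """Delay between when a fact is first known in the enterprise and
    when other stakeholders can access it."""
    if accessible_at < known_at:
        raise ValueError("information cannot be accessible before it is known")
    return accessible_at - known_at


# Hypothetical timestamps: an order captured in the afternoon reaches the
# downstream database only after a nightly batch load.
known = datetime(2005, 9, 1, 14, 30)
accessible = datetime(2005, 9, 2, 2, 0)
delay = information_float(known, accessible)
```

Here the float is 11 hours and 30 minutes. Tracking this figure per data movement shows where knowledge-workers are being made to wait.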

Delay in timely information incurs costs of missed or lost opportunity beyond the idleness of people time. Investment or futures transactions can incur considerable added costs or lost profits if the competitor has more timely information. One freight shipping company can incur costs as high as $1 million per hour waiting at a port if it lacks current information on the freight in its cargo containers, which might change owners three or four times during the shipping process!

Muda of Process Failure

Waste in manufacturing happens when defective components, parts or materials cause the process to fail, producing defective products. While the scrap and rework are muda described in the muda of repair/rejects, another form of muda occurs if it affects the customer. The best-case scenario here is if products have to be recalled after they are discovered to be faulty. However, the greater cost is when those defects cause injury or death to the customer. This results in unrecoverable damage and legal liability and costs.

Waste as a result of defective information (whether it is unclear definition of the data, inaccurate or missing information, or ambiguously presented information) generally incurs both the muda of repair/rejects as well as the muda of process failure.

  • The duplicate catalogs sent to a single household cost a mail-order company $3 to $5 million per year over several years in printing, mailing and sales loss because they had undetected duplicate records (30 percent) in their customer database.
  • The outage of a telephone center network cost several telemarketing organizations more than $1 million per day as the result of software bugs in a network software upgrade.
  • Name misspelling, incorrect addresses, incorrect order filling and/or invoicing can cost customer lifetime value when customers take their business elsewhere.

Information quality seeks to improve the processes to eliminate the information defects that cause process failure and the costs incurred as a result. It is relatively easy to measure the costs of lost customer lifetime value if you track customer complaints.

Muda of Suboptimized or Wrong Decisions

Decisions are rarely made in a vacuum. Management usually requires as much information as is available depending on the significance of a decision. If inaccurate or incomplete information is trusted and used to make decisions, the decisions made will probably be sub-optimized at best, or wrong at worst.

The costs of such muda are more difficult to calculate if they involve new opportunities, such as entering a new market, or mergers or acquisitions. However, in some cases, they may be very tangible. A major bank acquired a regional financial institution. Believing they had a good synergy, they acquired the institution without the appropriate due diligence that would have found liabilities. Among the liabilities was a vast overstatement of the number of customers: the institution had independent customer databases for each of its product lines, with significant overlap among customers holding multiple products.

After two years, the acquisition failed to produce the expected gains. In the end, the acquiring bank had to shut down all the acquired institution's branches and take a $2.7 billion write-down. Now that's a lot of muda!

The Real Problem

The real problem with poor quality information is not the poor quality information itself. It is about the costs of the waste caused by poor quality and about customer alienation and lost customer lifetime value. The real business case for information is to be found in measuring the costs of poor quality information and improving processes to prevent the defective information and the muda and costs associated with it.

Does your information quality function address muda elimination for the muda caused by defective information processes and defective information? If not, should it?

Measure the costs of muda of poor quality information. See "Measuring Nonquality Information Costs," in Improving Data Warehouse and Business Information Quality to learn how to do so.4 Then, prioritize and start improving the high priority processes. Measure your improvement in the muda eliminated, the increased customer satisfaction and the money returned to the bottom line.

What do you think? Let me know at Larry.English@infoimpact.com.


  1. Imai, Masaaki. Gemba Kaizen. New York: McGraw-Hill, 1997. pp. 8, 75.
  2. Ibid., pp. 75-86.
  3. English, Larry P. Improving Data Warehouse and Business Information Quality. New York: John Wiley & Sons, 1999. p. 143.
  4. Ibid., Chapter 7, pp. 199-235.


Larry P. English is president and principal of INFORMATION IMPACT International, Inc., Brentwood, Tennessee, and the author of the widely acclaimed book, Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits. English is cofounder of the International Association for Information and Data Quality (www.iaidq.org). English is an internationally recognized speaker, teacher, consultant and author and may be reached at larry.english@infoimpact.com or through his Web site at www.infoimpact.com. For more on how to improve your IQ principles and techniques, and prevent your organization from wasting millions in information scrap and rework, join the IAIDQ (visit www.iaidq.org).

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.