DM Review | Covering Business Intelligence, Integration & Analytics

Reality IT:
Data Mining – If Only It Really Were about Beer and Diapers

By Gabriel Fuchs
Column published in DMReview.com, July 1, 2004

At my job, we use data mining tools to figure out what the heck is really going on. Data mining has been around for quite some time now. About 10 years ago, many BI vendors even considered it the "next big thing" after ad hoc querying and OLAP tools. Who has not heard the beautiful example of the supermarket that wanted to know which product it sold most often together with diapers? Well, they mined the database that stored all the customer transactions and, to their big surprise, it turned out that beer was the product most often sold with diapers. On top of that, these purchases were made mainly on Friday afternoons by men between the ages of 25 and 35. After some serious thinking, the supermarket figured out the rationale: because diapers are bulky, the wife, who in most cases made the household purchases, left the diaper purchase to her husband, who had the car. The husband and father, most often between 25 and 35 years old, usually bought the diapers at the end of the working week. With the weekend approaching, beer often becomes a priority; and so beer became the product most often associated with the sale of diapers.

What did the supermarket do as a consequence? They put the beer display next to the diapers. The result was that the fathers who bought diapers and usually bought beer as well now bought even more beer, as it was so conveniently placed next to the diapers. The ones who had not bought beer before began to purchase it now that it was so visible and handy - just next to the diapers. Beer sales skyrocketed.

This story exists in several different versions; sometimes it is about 7-Eleven, sometimes about Wal-Mart. Sometimes it is not even about data mining, but about the benefits of data warehousing. It is a nice story for promoting data mining but, at the risk of disappointing many data mining fans, it would seem that it is not true. I have yet to be told this story by someone who was actually there, rather than by someone who heard it from someone who knows someone who seemed to have been there.

Even if the beer and diapers example may not be true, it is somewhat surprising that data mining has not really taken off as was predicted. The science is mature. Some of the data mining algorithms that are commonly used today were created 30 years ago, and data mining software has been around for quite some time. In other words, there are relatively stable products around. Also, some of the solutions offered no longer require the end user to have a Ph.D. in advanced mathematics in order to use them (and to understand why many men like beer). So why is it that data mining has not had its breakthrough in the BI market? I mean, look at it: who does not want automated solutions that can tell you what is actually going on? So what if data preparation is a major issue, or if you need some skills to handle a data mining tool? The effort involved in implementing a data warehouse is far bigger. And users who can efficiently handle ad hoc querying tools or OLAP solutions do not exist in abundance either.

You could figure out that beer is the preferred product with diapers, or whatever, with reporting tools alone. In that case, however, the user needs to know in advance which possible relations to look for. Data mining can automate all this. (Who does not want a convenient life where someone or something else does the job? Who would not be lazy if only it were possible?)
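To make the contrast concrete, here is a minimal sketch of the automated approach: instead of asking about diapers specifically, it scores every pair of products that appears in the same basket. The baskets are invented for illustration, the function name `pair_rules` and the `min_support` threshold are assumptions of this sketch, and the support, confidence and lift measures are the standard ones from association rule mining rather than anything the supermarket in the story is known to have used.

```python
from collections import Counter
from itertools import combinations

# Invented toy baskets, purely for illustration; not data from any real store.
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
]


def pair_rules(baskets, min_support=0.3):
    """Score every co-occurring item pair by support, confidence and lift.

    Returns tuples (lhs, rhs, support, confidence, lift), highest lift first.
    """
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all baskets containing both items
        if support < min_support:
            continue  # too rare to be worth reporting
        # Report the pair in both directions: "a implies b" and "b implies a".
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # above 1: positive association
            rules.append((lhs, rhs, support, confidence, lift))
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


for lhs, rhs, support, confidence, lift in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f} "
          f"confidence={confidence:.2f} lift={lift:.2f}")
```

Nobody had to tell the loop to look at diapers; every pair above the support threshold is reported. The flip side is that scanning all pairs gets expensive as the product catalogue grows, which is one reason real tools prune candidates rather than enumerate them naively.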

At the same time, it appears that organizations that actually use data mining are reaping huge benefits. These companies most often find themselves in highly competitive markets, such as telecommunications, high-volume retail or banking. Just imagine what hidden relations could be uncovered and used for improving the business. What if a mobile phone service finds out that there is an increase in phone calls from its married customers to other married customers at very odd hours? This could be translated into some really interesting and innovative business opportunities, such as an offer to hide such dialed numbers from the detailed phone bill. You know, even things that might be considered immoral by some do sell. If you do not believe this, some data mining analyses could prove the point and therefore convince you.

Even though data mining will not find all the truths and business opportunities, it can and does find examples similar to the beer and diapers connection. Even if the story itself may not be true (just think about it: which supermarket has actually put its beer shelf next to the diapers?), maybe supermarkets really should start to market beer and diapers together. That would make a truly good story true.



Gabriel Fuchs is a senior consultant with IBM. His column Reality IT takes an ironic look at what real-world IT solutions often look like - for better or for worse. The ideas and thoughts expressed in this column are based on Fuchs' own personal experience and imagination, and do not reflect the situation at IBM. He can be reached at gabriel.fuchs@ch.ibm.com.

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.