Portals eNewsletters Web Seminars dataWarehouse.com DM Review Magazine
DM Review | Covering Business Intelligence, Integration & Analytics
   Covering Business Intelligence, Integration & Analytics Advanced Search

Resource Portals
Analytic Applications
Business Intelligence
Business Performance Management
Data Integration
Data Quality
Data Warehousing Basics
More Portals...


Information Center
DM Review Home
Conference & Expo
Web Seminars & Archives
Current Magazine Issue
Magazine Archives
Online Columnists
Ask the Experts
Industry News
Search DM Review

General Resources
Industry Events Calendar
Vendor Listings
White Paper Library
Software Demo Lab

General Resources
About Us
Press Releases
Media Kit
Magazine Subscriptions
Editorial Calendar
Contact Us
Customer Service

Data Warehousing Lessons Learned:
Don't Let This Happen to Your Data Warehouse

  Column published in DM Review Magazine
January 2002 Issue
  By Lou Agosta

During the summer and autumn of 2001, Giga Information Group asked data warehousing people at financial service institutions, insurance companies, retailers, manufacturers, transportation companies, and organizations in the private sector and education to let Giga take a look at the weaknesses in data warehousing solutions that they had implemented. Giga also asked what they would do differently if they had the chance. The result is a revealing, sometimes painful and often humorous look at the uses and abuses of data warehousing in the real world.

It would be an exaggeration to say that in every instance Giga has an easy solution. However, in every instance, there is a lesson learned and a best practice that can be inferred. In every case, it is important for firms building or operating data warehouses to understand what can go wrong. This enables action to be taken to reduce the risk of surprises such as schedule slips, disappointed customers or budget overruns due to technology miscalculations. This, in turn, enables implementation of state-of- the-art data warehouses that support high-impact business applications in CRM, supply chain management and business intelligence.

One data warehouse implementer reports: The major weakness of our data warehouse is its lack of data. This organization has many different kinds of data, but the data warehouse focuses only on customer billing information. There is no order, delivery, warranty or detailed line-item data. The main reason for not having this data, according to the client, is lack of justification for expending the IT effort to access the data stores.

A data warehouse is no better than its access to upstream transactional systems. The situation described by this client vignette occurs when the data warehouse lacks access to the diverse operational systems that hold the transactions for order completion, delivery and detailed line items. The client suffers from the latter. There also is evidence of organizational obstacle ? possibly territorial behavior on the part of some operational systems whereby access to the required data is denied. The direct approach is to analyze the business problem which, in this case, highlights the additional data required to get a more complete view of the customer. Next, design and implement dimensions that capture and forward the information from the operational systems to the data warehouse. Finally, use that information to address the business issue (not explicitly stated here) such as selling additional products to those whose warranty data indicates the product is at the end of its useful life. Lesson learned: A basic architectural principle can usefully be employed here ? the data without the application is meaningless, and the application without the data is empty.

In a related but slightly different problem, a respondent complains: The data warehouse contains the "raw" data ? mostly extracted from operational systems. We had very limited design and little effort was spent on rolling up or summarizing the data. It is very complicated to collect all the needed information ? and to integrate and aggregate it. As a result, every new source takes a lot of effort, which has prevented maximum use and business benefit from the warehouse. Along the same lines: The data warehouse was built, not designed. We keep way too much detail and do not summarize enough data into prebuilt tables. Too many of our queries are ad hoc and take long times to run because of the detail level in the tables. The folktale of the three little pigs and the wolf comes to mind. All three of the pigs built their houses ? only one designed his.

This is a slightly different issue, but related to the first. In the first example, the data warehouse lacked data. In the second, the data warehouse lacks information. Unlike the first example, there is plenty of data ? but not enough information. The irony is that not enough effort was expended at the front end to design aggregates; therefore, extra effort must be incurred at the back end to collect, integrate and aggregate all the information needed. The good news is that if the detailed data is available, building aggregates should be relatively easy. Indeed, knowing that 20 percent of the data gets used 80 percent of the time is often not obvious until an installation gets some experience under its belt. Although this is not a discussion about tools, getting technology to monitor database usage can be helpful at this point. Aggregates are one of the chief performance enhancers for any data warehouse; therefore, justifying their creation and maintenance is relatively easy to do. Adding business views and summaries (e.g., by time period, business unit, product family, etc.) should be a priority. Also, the introduction and use of data mining technology/tools to discover unforeseen relationships and correlative factors would greatly enhance the perception of value, use of the warehouse and provide better insight into the business. Lesson learned: Design the data warehouse; don't just build it.


For more information on related topics visit the following related portals...
DW Administration, Mgmt., Performance, DW Design, Methodology and DW Basics.

Lou Agosta is the lead industry analyst at Forrester Research, Inc. in data warehousing, data quality and predictive analytics (data mining), and the author of The Essential Guide to Data Warehousing (Prentice Hall PTR, 2000). Please send comments or questions to lagosta@acm.org.



Solutions Marketplace
Provided by IndustryBrains

Bowne Global Solutions: Language Services
World's largest language services firm offers translation/localization, interpretation, and tech writing. With offices in 24 countries and more than 2,000 staff, we go beyond words with an in depth understanding of your business and target markets

Award-Winning Database Administration Tools
Embarcadero Technologies Offers a Full Suite of Powerful Software Tools for Designing, Optimizing, Securing, Migrating, and Managing Enterprise Databases. Come See Why 97 of the Fortune 100 Depend on Embarcadero!

Online Backup and Recovery for Business Servers
Fully managed online backup and recovery service for business servers. Backs up data to a secure offsite facility, making it immediately available for recovery 24x7x365. 30-day trial.

Data Mining: Strategy, Methods & Practice
Learn how experts build and deploy predictive models by attending The Modeling Agency's vendor-neutral courses. Leverage valuable information hidden within your data through predictive analytics. Click through to view upcoming events.

Test Drive the Standard in Data Protection
Double-Take is more affordable than synchronous mirroring and enables you to recover from an outage more quickly than tape backup. Based upon the Northeast blackout and the west coast wild fires, can you afford to be without it?

Click here to advertise in this space

View Full Issue View Full Magazine Issue
E-mail This Column E-Mail This Column
Printer Friendly Version Printer-Friendly Version
Related Content Related Content
Request Reprints Request Reprints
Site Map Terms of Use Privacy Policy

Thomson Media

2005 The Thomson Corporation and DMReview.com. All rights reserved.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.