Stranded on Islands of Data

Article published in DM Review Magazine, November 1998 Issue

By David Marco

There is a severe disease that has spread to epidemic proportions throughout our society. This disease is particularly dangerous as its effects are not readily identifiable at the time of infection. However, if this condition goes untreated, it can be debilitating and even terminal. This disease is not hepatitis, but rather "independent" data marts. While this imagery may seem a bit dramatic, unfortunately it reflects the reality in many of today's companies.

This article is the first of a two-part series on migrating from independent data marts to an architected solution. This installment addresses the characteristics of independent data marts, the flaws in their architecture and the reasons why they exist. Part two will run in the December issue of DM Review and will address specifically how a company can migrate from the independent data mart architecture to an architected solution.

Characteristics of Independent Data Marts

Independent data marts are characterized by several traits. First, each data mart is sourced directly from the operational systems without an enterprise data warehouse to supply the architecture necessary to sustain and grow the data marts. Second, these data marts are typically built independently from one another by autonomous teams. Typically, these teams will utilize varying tools, software, hardware and processes.

Possibly the most visually descriptive trait of a company that has constructed independent data marts is that once it maps out a schema of its decision support systems (DSSs), the schema will resemble a "spaghetti" chart (see Figure 1).* What is most disturbing is the number of companies that have said this chart resembles their current DSS architecture.

Obviously, this architecture is not an architecture at all. Instead, it is a series of "stovepipe" decision support systems. This arrangement differs greatly from that of an architected data warehouse (see Figure 2).

The purpose of this article is to discuss independent data marts and the process for migrating to an architected solution, so it will touch only briefly on the topic of DSS architecture. It will not go into a detailed discussion of top-down versus bottom-up approaches, except to say that the "classic" top-down approach is a more scalable and logical approach for constructing a DSS. It is surprising how often the top-down methodology is mistaken for a "galactic" approach. This is a misunderstanding, as the top-down approach is best used iteratively and incrementally to build the DSS. When used in this fashion, the cost of building a data warehouse that feeds "dependent" data marts becomes highly comparable to the cost of building independent data marts.

Problems With Independent Data Marts

Redundant Data:
As the number of independent data marts grows, the amount of redundant data begins to grow uncontrollably across the enterprise. This redundancy occurs because each of the independent data marts requires its own, typically duplicated copy of the detailed corporate data. Often a great deal of this detailed data is not required in the data marts, which typically provide summarized views.

It would be enlightening if a study were conducted to calculate the costs of maintaining unnecessary redundant data for Fortune 1000 companies. The end total would likely be in the billions of dollars in expenses and lost opportunity.

Redundant Processing:
A data warehouse provides the architecture to centralize integration and cleansing activities common to all of the data marts of a company. Without the data warehouse, all of these integration and cleansing processes need to be duplicated for all of the independent data marts. This greatly increases the number of support staff required to maintain the DSS system, creating a particularly disastrous situation for most companies in light of today's IT staffing shortage.

Separate teams will typically build each of the independent data marts in isolation. As a result, these teams do not leverage one another's standards, processes, knowledge and lessons learned. This results in a great deal of rework.

These autonomous teams will commonly select different tools, software and hardware. This forces the enterprise to retain skilled employees to support each of these technologies. In addition, a great deal of financial savings is lost, as standardization on these tools doesn't occur. Often a software, hardware or tool contract can be negotiated to provide considerable discounts for enterprise licenses. These economies of scale can provide tremendous cost savings to the organization.

Scalability:
Independent data marts directly read operational system files and/or tables, which greatly limits the DSS's ability to scale. For example, if a company has five independent data marts, it is likely that each data mart would require customer information. Therefore, there would be five separate extracts being pulled from the same customer tables in the operational system of record. Most operational systems have limited batch windows and cannot support this number of extracts. With a data warehouse, only one extract is required in the operational system of record.
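The five-mart example can be sketched as a small counting exercise. This is only an illustration under stated assumptions: the mart names, table names and both helper functions are hypothetical and do not come from any particular company or tool.

```python
from collections import Counter

# Hypothetical mapping of each data mart to the operational tables it needs.
mart_needs = {
    "sales": ["customer", "orders"],
    "marketing": ["customer", "campaigns"],
    "finance": ["customer", "invoices"],
    "service": ["customer", "tickets"],
    "risk": ["customer", "accounts"],
}

def independent_extracts(needs):
    """Each independent mart extracts every table it needs directly
    from the operational system, so shared tables are read repeatedly."""
    counts = Counter()
    for tables in needs.values():
        counts.update(set(tables))
    return counts

def warehouse_extracts(needs):
    """A warehouse extracts each needed table once; the dependent
    marts are then fed from the warehouse, not the operational system."""
    needed = set()
    for tables in needs.values():
        needed.update(tables)
    return Counter({table: 1 for table in needed})

print(independent_extracts(mart_needs)["customer"])  # 5 extracts of the customer table
print(warehouse_extracts(mart_needs)["customer"])    # 1 extract of the customer table
```

In this toy model the load on the customer table grows with every mart added under the independent approach, while under the warehouse approach it stays at one extract regardless of how many marts are fed downstream.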

Non-Integrated:
As previously discussed, each independent data mart is built by an autonomous team, typically working for a separate department. As a result, these data marts are not integrated and none of them contains an enterprise view of the corporation. Therefore, if the CEO asks the IT department to provide a "listing of our most profitable customers," each data mart will offer a different answer. Having worked with companies that have experienced this exact situation, I can attest that the CIO is rarely pleased to have to explain why his department cannot answer this seemingly simple question.
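A minimal sketch of how two departmental marts might disagree on the "most profitable customer" question follows. All names and figures are invented for illustration, and the assumption that each mart sees only part of the cost picture is mine, not a description of any specific client.

```python
# Hypothetical figures; no real customer data is implied.
customers = {
    "Acme":   {"revenue": 1000, "cost_of_goods": 600, "support_cost": 350},
    "Globex": {"revenue": 800,  "cost_of_goods": 500, "support_cost": 50},
}

def sales_mart_profit(c):
    # Assumed sales mart: holds order data only, so support cost is invisible.
    return c["revenue"] - c["cost_of_goods"]

def service_mart_profit(c):
    # Assumed service mart: holds support data only, so cost of goods is invisible.
    return c["revenue"] - c["support_cost"]

def enterprise_profit(c):
    # Integrated view: all costs are available in one place.
    return c["revenue"] - c["cost_of_goods"] - c["support_cost"]

def most_profitable(profit_fn):
    """Return the customer name that ranks first under a given profit definition."""
    return max(customers, key=lambda name: profit_fn(customers[name]))

print(most_profitable(sales_mart_profit))    # Acme
print(most_profitable(service_mart_profit))  # Globex
print(most_profitable(enterprise_profit))    # Globex
```

Each mart's answer is internally consistent, but the two cannot be reconciled without an integrated definition of profit, which is the role the enterprise view plays in this sketch.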

One of the chief phenomena facing corporations today is the current merger and acquisition craze. Interestingly enough, one of the key factors fueling this movement is these companies' desire to reduce their IT spending. In light of this situation, the costs associated with independent data marts become even more magnified as companies continue to focus on controlling their ever-growing IT costs.

It is important to note that many companies that have built independent data marts are currently in the process of migrating away from them. Needless to say, the cost--in dollars and time--for the migration is not trivial.

Why Do Independent Data Marts Exist?

With all of these architectural flaws, it would seem surprising that so many companies have built their decision support systems around this architecture. There are several reasons why this aberration has occurred.

DSSs Are Complex:
When the decision support craze spread, most companies were looking to build a data warehouse of their own. Unfortunately, the task of building a well-architected and scalable business intelligence system is complicated and requires sophisticated software, expensive hardware and a highly skilled and experienced team. Finding data warehouse architects and project leaders who truly understand data warehouse architecture is a daunting challenge, both in the corporate and consulting ranks.

In order to construct a data warehouse, a corporation must truly come to terms with its data and the business procedures that the data represents. While this task is challenging, it is a necessary step and one from which the true value of the DSS process is derived.

Independent Data Mart Shortcut:
Building independent data marts is less expensive, at least initially, than building architected decision support systems. In addition, independent data marts can be constructed fairly quickly and, unlike a data warehouse, do not require a company to really understand its data beyond that of individual departments. These points have been effectively used to sell the concept of constructing independent data marts. Unfortunately, it is this lack of thorough analysis and long-term planning that prevents independent data marts from being an effective business intelligence system.

Inappropriate Vendor Messages:
Many vendors have developed tools that are effective at building small, departmental independent data marts. These companies, in their rush to market with these tools, have worked very hard at selling the independent data mart concept (of course, it is never worded like this). The reasons are obvious. These companies can significantly reduce their sales cycles because only one department is involved in the software purchasing decision. In addition, their software requires much less sophistication because it merely needs to build a standalone data store.

The current vendor buzzword in today's market is "turnkey." Everyone seems to offer a "turnkey" DSS solution. Unfortunately, merely purchasing a "turnkey" solution does not alleviate the task of learning and understanding a corporation's data and its business processes. Integration of data from disparate systems requires a careful analysis and an understanding of business processes and the data that represents them. There isn't a "magic bullet" or "turnkey" solution that alleviates this task.

The December issue of DM Review will present the two approaches for migrating from independent data marts, along with an independent data mart migration case study.

* It is important to note that this chart is an actual client's DSS architecture schematic. I'm proud to say that they are no longer on this architecture.


David Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence and is the world's foremost authority on meta data. He is the author of Universal Meta Data Models (Wiley, 2004) and Building and Managing the Meta Data Repository: A Full Life-Cycle Guide (Wiley, 2000). Marco has taught at the University of Chicago and DePaul University, and in 2004 he was selected to the prestigious Crain's Chicago Business "Top 40 Under 40."  He is the founder and president of Enterprise Warehousing Solutions, Inc., a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing and meta data repository technologies. He may be reached at (866) EWS-1100 or via e-mail at DMarco@EWSolutions.com.

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.