Portals eNewsletters Web Seminars dataWarehouse.com DM Review Magazine
DM Review | Information Is Your Business
   Information Is Your Business Advanced Search
advertisement

RESOURCE PORTALS
Business Intelligence
Compliance
Corporate Performance Management
Data Integration
Data Quality
Data Warehousing Basics
ETL
Master Data Management
View all Portals

WEB SEMINARS
Scheduled Events

RESEARCH VAULT
White Paper Library
Research Papers

CAREERZONE

Advertisement

INFORMATION CENTER
DM Review Home
Newsletters
Current Magazine Issue
Magazine Archives
DM Review Extended Edition
Online Columnists
Ask the Experts
Industry News
Search DM Review

GENERAL RESOURCES
Tech Evaluation Center:
Evaluate IT solutions
Bookstore
Buyer's Guide
Glossary
Industry Events Calendar
Software Demo Lab
Vendor Listings

DM REVIEW
About Us
Press Releases
Awards
Advertising/Media Kit
Reprints
Magazine Subscriptions
Editorial Calendar
Contact Us
Customer Service

Data Integration Adviser:
The Trial-and-Error Method for Data Architecture

  Column published in DM Review Magazine
January 2007 Issue
 
  By Rick Sherman

As a coach of my sons' soccer teams, I've noticed something that occasionally holds back the ambitious players. Rather than focusing on basics, such as learning good ball control, they are seduced by the lure of scoring. But without good basic skills, they don't have the ability to score.

The same thing happens in business. As a consultant, my job is to guide clients through all the stages of a business intelligence (BI)/data warehouse (DW) project. Ideally they avoid the common mistakes of a neophyte, but some businesses don't have the patience to learn. They want to jump right into the game and end up learning the expensive way: by trial and error.

Beginner Mistakes

Three of the common mistakes beginners make in architecting data include:

1. Letting enterprise applications inspire the architecture. BI/DW data architecture design, specifically DWs and data marts, should not be designed like the enterprise applications used as the source systems. But people design what they know.

That may be a quick way to design the DW, but it puts the emphasis on operational processing rather than BI. This can make the downstream data marts or reports add complex and time-consuming processing in order for the data to be consumable. And if there is only a DW, it is harder to build and deploy BI applications.

Many longstanding DWs have followed this approach. If you hear that the DW is too complex, makes it difficult to create BI queries and processes queries slowly, then the BI/DW team may have based its design on the enterprise application. In their defense, they may not have even realized it.

This approach was not designed from the business user's BI and performance management requirements but rather from the transactional processing requirements of the source systems. It doesn't take into account many of the best practices and methods that can be used to provide a truly significant business ROI.

Enterprise application vendors generally fall into this design trap when they initially build their DW, BI or corporate performance management (CPM) solutions because it reflects their frame of reference, but it can be just as damaging to your BI and CPM solutions.

2. Engaging in DW schema wars. The team designing the DW may fall into the "religious" war between having everything in third normal form (3NF) or in a dimensional model. As with the first challenge, the basic concern is that the design needs to be created based on business needs rather than an esoteric data modeling concept. Most DW environments need data stored in both 3NF and dimensional models, not either/or. The impact shifts data integration logic to the BI application, thus increasing time and costs to develop the BI solution.

The flip side is that the DW team decides everything should be stored in a dimensional model and not in a normalized manner. Although this is great for BI reports and analysis - hence, perfect for a data mart - if your DW needs to support history, has changing dimensions and you need to implement data integrity and quality, then the normalized form is the best practice under these conditions.

This mixed environment is really a DW utilizing a normalized form to store data historically and manage change data capture (CDC) with a data mart in a dimensional model to enable BI reporting and analysis. Nowadays these two approaches can be implemented within the same database using different logical areas, such as a schema in an Oracle environment.

3. Snubbing summary tables. I love it when someone tells me that they don't need a data mart or summary tables, relying on their DW tables and the associated database technology to provide all the performance they need. They wonder why they need to store the data again when they bought the best database and a terrific hardware platform with a lot of memory and fast disk arrays.

The first thing I do is check out the BI reports and analysis. The data has to be summarized and aggregated to start most analysis. If it is not done in the DW/data mart, then the BI report code is going to do it, making it much more costly to develop and maintain.

In addition, you'll also see some of these reports creating temporary summary tables to improve performance.  Instead of doing it once for everyone in a data mart, each BI report does it over again. Not only is this time-consuming, but also, every time someone different does it, the more likely it is that the numbers are going to differ.

Even if there are no summary tables, there may be BI cubes or, even worse, data shadow systems built by the business users to make up for this shortcoming in the data architecture.

A little patience goes a long way when a business is designing its data architecture. In future columns I'll discuss some of the pitfalls in data integration, BI, project management, production and ongoing support.  

...............................................................................

For more information on related topics visit the following related portals...
Data Management and Enterprise Achitecture.

Rick Sherman has more than 20 years of business intelligence and data warehousing experience, having worked on more than 50 implementations as a director/practice leader at PricewaterhouseCoopers and while managing his own firm. He is the founder of Athena IT Solutions, a Boston-based consulting firm that provides data warehouse and business intelligence consulting, training and vendor services. Sherman is a published author of over 50 articles, an industry speaker, a DM Review World Class Solution Awards judge, a data management expert at searchdatamanagement.com and has been quoted in CFO and Business Week. Sherman can be found blogging on performance management, data warehouse and business intelligence topics at The Data Doghouse.You can reach him at rsherman@athena-solutions.com or (617) 835-0546.

In addition to teaching at industry conferences, Sherman offers on-site data warehouse/business intelligence training, which can be customized and teaches public courses in the Boston area. He also teaches data warehousing at Northeastern University 's graduate school of engineering.

 

Solutions Marketplace
Provided by IndustryBrains

Speed Databases 2500% - World's Fastest Storage
Faster databases support more concurrent users and handle more simultaneous transactions. Register for FREE whitepaper, Increase Application Performance With Solid State Disk. Texas Memory Systems - makers of the World's Fastest Storage

Free DB Modeling Trial with ER/Studio
Design and Build More Powerful Databases with ER/Studio.

Database Access Tool
Reverse engineers relational schemas and automatically generates data access code that can be used in Service Oriented Architectures.

EMC - The easier way to archive is here
Minimize risk, control costs, and protect vital information with EMC's software archiving platform.

DeZign for Databases - Database Design Made Easy
Create, design & reverse engineer databases with DeZign for Databases, a database design tool for developers and DBA's with support for Oracle, MySQL, MS SQL, MS Access, DB2, PostgreSQL, InterBase, Firebird, NexusDB, dBase and Pervasive.

Click here to advertise in this space


View Full Issue View Full Magazine Issue
E-mail This Column E-Mail This Column
Printer Friendly Version Printer-Friendly Version
Related Content Related Content
Request Reprints Request Reprints
Advertisement
advertisement
Site Map Terms of Use Privacy Policy
SourceMedia (c) 2007 DM Review and SourceMedia, Inc. All rights reserved.
SourceMedia is an Investcorp company.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.