DM Review | Covering Business Intelligence, Integration & Analytics
Data Warehousing Lessons Learned:
Hub-and-Spoke Architecture Most Popular for Data Warehousing

Column published in DM Review Magazine
December 2003 Issue
By Lou Agosta

According to the recent survey conducted with our partners at The Data Warehousing Institute (TDWI), the most frequently implemented architecture is the data warehouse with attached data marts (43 percent of respondents). This corresponds most closely to what is also described as a hub-and-spoke architecture: a central data store with attached (dependent) data marts. Also of interest are a large group (19 percent) committed to centralized data warehousing pure and simple, and a significant number (11 percent) who are not sure what they have.

Interestingly, five percent of respondents report that "other" best describes their architecture. The first five options, from federated data warehouse and marts through non-conformed (highly distributed) data marts, are exhaustive distinctions. Therefore, if an enterprise claims to have a hybrid of these, it means it really does not know what to call the spaghetti-like structures that make up its implied architecture.

These results suggest that enterprises plan centrally but occasionally choose, or are forced, to make compromises. Sometimes data marts represent a compromise forced on a central design by the need for an interim deliverable or an incremental result, by a powerful political constituency that wants its own system, or by performance considerations.

Data warehousing architectural options are many and varied. However, five main patterns are found both logically and practically. A firm can operate a distributed data warehouse architecture (data marts only), in effect moving and translating data between the nodes of a network of data stores. A firm can build only a centralized consolidation data warehouse or operational data store (ODS). A firm can build a centralized data warehouse in hub-and-spoke form, with attached data marts. A firm can build a federated system of distributed warehouses that, in its more successful implementations, sometimes also exploits a data hub but has no persistent centralized data store. Or a firm can try to avoid physical data warehousing altogether and address decision support issues by deriving business intelligence (BI) directly from operational, transactional systems.

This last option is sometimes described as a "virtual data warehouse." In the so-called virtual data warehouse, there is no persisting physical implementation. Data is transformed repeatedly instead of being transformed once and stored persistently. This has sometimes resulted in performance issues, which at least partly explain the low turnout for virtual data warehousing. The survey results suggest that virtual data warehousing is not gaining traction: fewer than two percent of respondents report a virtual warehouse as their architecture.
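The difference between transforming data on every request and transforming it once for storage can be made concrete with a small sketch. This is only an illustration under stated assumptions: the row layout, the `Transformer`, `VirtualWarehouse` and `PhysicalWarehouse` classes, and the cleaning rules are all hypothetical and do not come from the survey or from any product. The sketch also ignores how a physical store is refreshed when the operational data changes.

```python
from typing import Dict, List

Row = Dict[str, object]

# Hypothetical operational rows; the field names are illustrative only.
OPERATIONAL_ROWS: List[Row] = [
    {"cust": " acme ", "amount_cents": 1250},
    {"cust": "Globex", "amount_cents": 9900},
]


class Transformer:
    """Cleans operational rows and counts how many rows it has transformed."""

    def __init__(self) -> None:
        self.rows_transformed = 0

    def transform(self, row: Row) -> Row:
        self.rows_transformed += 1
        return {
            "customer": str(row["cust"]).strip().title(),
            "amount": int(row["amount_cents"]) / 100,
        }


class VirtualWarehouse:
    """No persistent store: every query re-transforms the operational rows."""

    def __init__(self, source: List[Row], transformer: Transformer) -> None:
        self.source = source
        self.transformer = transformer

    def query_total(self) -> float:
        return sum(self.transformer.transform(r)["amount"] for r in self.source)


class PhysicalWarehouse:
    """Transforms the rows once at load time and stores the result."""

    def __init__(self, source: List[Row], transformer: Transformer) -> None:
        self.rows = [transformer.transform(r) for r in source]

    def query_total(self) -> float:
        return sum(r["amount"] for r in self.rows)


if __name__ == "__main__":
    virtual_work = Transformer()
    virtual = VirtualWarehouse(OPERATIONAL_ROWS, virtual_work)
    physical_work = Transformer()
    physical = PhysicalWarehouse(OPERATIONAL_ROWS, physical_work)

    # Run the same query three times against each design.
    for _ in range(3):
        virtual.query_total()
        physical.query_total()

    print("virtual rows transformed:", virtual_work.rows_transformed)
    print("physical rows transformed:", physical_work.rows_transformed)
```

In this toy setup the virtual design repeats the transformation work for every query, while the physical design pays for it once at load time. That is one plausible reading of the performance issues mentioned above, not a measurement of any real system.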

The hub-and-spoke approach is not the only approach, but it is no accident that it is popular. The number of times the data must be transformed is optimal for a majority of scenarios involving many-to-many relationships among the nodes in a network of source and target data stores. The critical path to success lies through the design and implementation of unified and consistent data dimensions relating to products, customers, promotions, channels, costs, revenues and other dimensions important to a given business.

The single most important action a business can take from an architectural point of view is therefore to design consistent and unified definitions of product, customer, channel, etc. It can then implement federated data marts, a centralized ODS or some combination of the two. This also enables management of diverse online analytical processing (OLAP) applications as dependent data marts rather than as disconnected and dysfunctional silos. The data stores will interoperate, and the design will be sufficiently robust to support a flexible architecture that accommodates business requirements that business managers cannot necessarily foresee today.
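The point about the number of transformations in a many-to-many network can be illustrated with simple arithmetic. The sketch below assumes a simplified model that is not spelled out in the column: in a point-to-point design every source feeds every target directly, so each source-target pair needs its own transformation, while in a hub-and-spoke design each source loads the hub once and each data mart is fed from the hub once. The function names are hypothetical.

```python
def point_to_point_interfaces(sources: int, targets: int) -> int:
    """Every source feeds every target directly: one transformation per pair."""
    return sources * targets


def hub_and_spoke_interfaces(sources: int, targets: int) -> int:
    """Each source loads the hub once; each target is fed from the hub once."""
    return sources + targets


if __name__ == "__main__":
    for sources, targets in [(1, 3), (2, 2), (5, 4), (10, 8)]:
        p2p = point_to_point_interfaces(sources, targets)
        hub = hub_and_spoke_interfaces(sources, targets)
        print(
            f"{sources} sources, {targets} targets: "
            f"point-to-point={p2p}, hub-and-spoke={hub}"
        )
```

Under this model the hub needs fewer transformations as soon as there are several sources and several targets (for example, 9 instead of 20 for five sources and four targets). With a single source or a single target the direct design needs fewer, which is consistent with the hub being optimal for a majority of scenarios rather than all of them.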



Lou Agosta is the lead industry analyst at Forrester Research, Inc. in data warehousing, data quality and predictive analytics (data mining), and the author of The Essential Guide to Data Warehousing (Prentice Hall PTR, 2000). Please send comments or questions to lagosta@acm.org.




Thomson Media

2005 The Thomson Corporation and DMReview.com. All rights reserved.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.