The Data Warehousing Satisfaction Survey, Part 2: The Bounds of Data Warehousing Limited only by the Business Imagination

The IBM Data Warehousing Satisfaction Survey (2007) consisted of an invitation to some 200 end-user enterprises to participate in an anonymous, Web-based survey about data warehousing architecture, latency, size and related business issues. Invitations were sent to enterprises regardless of the data warehousing platforms they were using, and respondents included the complete spectrum of what is in the market at this time, including IBM, Microsoft, Netezza, Oracle and Teradata platforms. The emphasis was on surfacing trends that apply regardless of the specific data warehousing platform. Here is a look at some of the initial results of the survey. (See Editor's Note 1.)

Part 1 appeared in DM Direct Special Report on October 2, 2007.

A Single Fact is Worth a Thousand Opinions

Respondents to IBM's satisfaction survey report much good news about data warehousing systems. At the top of the list (see Table 1), about 73 percent of respondents say that data warehousing is useful in making business decisions and guiding business operations. Data warehouses provide visibility into business trends that require watching, and they provide a single view of the business, the famous "single source of truth." Some 56 percent of respondents report that their data warehouse delivers quantifiable business value. The question was worded to require accountability: the objectives (the "success criteria") were to deliver quantifiable business value. This makes explicit that the calculation of business value is provided by the responding enterprise. A single fact is worth a thousand opinions when it comes to decision making, and data warehousing provides the facts.

Table 1 (Percentages will sum to more than 100 percent due to allowing multiple responses.)
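The note on multiple responses can be illustrated with a minimal sketch. The respondents and answer options below are invented for illustration and are not survey data; the function name `option_percentages` is likewise hypothetical. Each percentage is computed against the number of respondents, not the number of ticks, which is why the column can total more than 100.

```python
from collections import Counter

# Hypothetical responses (NOT survey data): each respondent may select
# several options, so one respondent can be counted under several rows.
RESPONSES = [
    {"useful for decisions", "quantifiable value"},
    {"useful for decisions"},
    {"useful for decisions", "quantifiable value", "governance works"},
    {"quantifiable value"},
]

def option_percentages(responses):
    """Return the share of respondents (in percent) selecting each option."""
    counts = Counter(option for chosen in responses for option in chosen)
    total_respondents = len(responses)
    return {opt: 100.0 * n / total_respondents for opt, n in counts.items()}

percentages = option_percentages(RESPONSES)
column_total = sum(percentages.values())
```

With these made-up inputs, the individual rows are each at most 100 percent, but the column total exceeds 100 because seven selections are spread over only four respondents.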

In contrast to the positive results highlighted in Table 2, only 15 percent of respondents report in Table 1 that they have a data governance process that works. This implies a significant opportunity to make progress in the data warehousing capability maturity model and, frankly, seems at odds with the rosy reporting about quantification of business value from the data warehouse. Still, the data point that 56 percent calculate the value implies that 44 percent do not regard this as a priority. When you think about it, this is a substantial number, especially given the substantial investments required to build and operate a data warehousing system. This indicates a significant area for improvement that will challenge enterprises to raise the bar on the levels of leadership, professionalism and accountability that they bring to the collaboration between the business units and the information technology organizations.

The Bounds of the Data Warehouse are Limited Only by the Business Imagination

Data warehousing provides infrastructure and support for a wide variety of business intelligence and decision support applications. The business processes for which data warehousing acts as an enabler extend from the front office with its customer facing and marketing systems to the back office with cost containment and pricing optimization solutions. Advanced applications in fraud detection, demand planning and pricing optimization are also becoming more common and point to the deployment of second and third generation data warehousing systems. Those enterprises that handle physical merchandise see wide-ranging applications in demand planning, inventory reduction (and savings) and supply chain optimization. The "other" category where enterprises are allowed to report free form on their uses of data warehousing contains an interesting array of solutions. Several compliance applications showed up including "pharmaco-vigilance," credit risk management, operational risk as well as selling data about consumers and consumer trends, tracking royalties, health research, provider networks, merchandising, retail store management, clinical efficacy and quality improvement. Obviously the list can be extended even further. The bounds of data warehousing are limited only by the bounds of the business imagination.

Table 2 (Percentages will sum to more than 100 percent due to allowing multiple responses.)

As shown in Table 2, financial applications are at the top of the list, with more than 66 percent of enterprises reporting this use of data warehousing. Companies want to know "How are we doing financially? Are we getting our numbers? What does the pipeline look like?" Other key issues addressed by data warehouses include marketing and sales numbers, which are a superset of CRM and include trend analysis in a variety of contexts including cross-selling. As indicated, these results show that one of the fundamental reasons data warehouses are built and operated is to address the business question: "What customers are buying (or using) what product and service by what channel and time period?" The intersection of the two - customer and product - is the basic transaction in which the customer buys the product, resulting in the revenue on which the financial statements report. If we drill down on the customer dimension, we obtain CRM applications, such as customer service, cross-selling, reducing customer churn ("loyalty"), and fraud detection and reduction. If we drill down on the product dimension, we get a variety of applications such as demand planning, revenue optimization, market basket analysis, inventory management, logistics and distribution.
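The business question above can be sketched as a tiny fact table whose revenue is summed along whichever dimensions are of interest. This is only an illustrative sketch: the rows, the field names and the helper `revenue_by` are assumptions made for the example and do not come from the survey or from any particular data warehousing product.

```python
from collections import defaultdict

# Hypothetical sales facts: (customer, product, channel, period, revenue).
# All values are invented for illustration; none come from the survey.
SALES = [
    ("acme", "widget", "web", "2007-Q1", 100.0),
    ("acme", "gadget", "store", "2007-Q1", 50.0),
    ("bolt", "widget", "web", "2007-Q2", 75.0),
    ("bolt", "widget", "store", "2007-Q2", 25.0),
]
FIELDS = ("customer", "product", "channel", "period")

def revenue_by(*dims):
    """Sum revenue grouped by the named dimensions (e.g. 'customer')."""
    positions = [FIELDS.index(d) for d in dims]  # ValueError on unknown name
    totals = defaultdict(float)
    for row in SALES:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]  # last column is revenue
    return dict(totals)

by_customer = revenue_by("customer")             # customer-dimension view
by_product = revenue_by("product")               # product-dimension view
by_both = revenue_by("customer", "product")      # the basic transaction grain
```

Grouping by customer alone corresponds to the CRM-style view, grouping by product alone to the demand-planning view, and grouping by both to the customer-product intersection described in the text.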

Data Warehousing Architecture: Avoiding "Religious Wars"

Data warehousing architectures are almost as diverse and heterogeneous as data warehousing installations themselves. However, for purposes of this survey, we distinguished two, centralized and distributed data warehousing architectures, with each of these, in turn, having two forms, conformed or non-conformed. Here "conformed" means aligning with a consistent design that applies a consistent data model to all (or almost all) of the data entities, dimensions or structures. Another word for "conformed" is federated, sharing a common design though not a common database instance. According to Table 3, the results are a close race between users of a centralized, atomic data store and those with a hub-and-spoke architecture characterized by a persisting, centralized data store and attached (dependent) data marts. It was surprising to the survey team that conformed data marts with no persisting data store were not better represented. This approach was (and is) Ralph Kimball's contribution. It is otherwise known to be widely implemented in certain contexts and is likely underrepresented here due to the concentration of survey respondents from large, multiterabyte, high-end enterprises that look to reduce coordination costs through centralized processing. A significant, though minority, percentage of respondents (12 percent) acknowledge having non-conformed silos (data marts) only. Indeed the prize for the most humorous free-form response goes to the individual who suggested a song to be sung to the melody of Old MacDonald Had a Farm: "Here a silo, there a silo, everywhere a silo ... Old MacDonald had a data warehouse, etc." Even if conformed data marts are underrepresented due to sampling so many large enterprises, this survey detects a trend toward centralization.

The virtual data warehouse is a novel and innovative approach to architecture, especially if service-level expectations are managed carefully. It is tactically significant in terms of system interoperability, especially as computing power and bandwidth continue to improve. However, it tends to overlook the basic distinction between query-intensive and transaction-intensive systems. It also invites throwing server hardware upgrades and expensive bandwidth at performance problems as opposed to designing and implementing architecture. Experience suggests it is also a solution that is potentially open to performance bottlenecks and so (as always) an approach that must be carefully qualified in terms of the customer's real-world environment. It is the clear loser in this survey, as indicated in Table 3, with no one acknowledging having such an architecture.

Table 3 (Percentages sum to less than 100 percent due to rounding.)

In conclusion, firms that are highly centralized in geography and governance should pursue a centralized data warehouse architecture to reap the greatest operational efficiencies and business benefits. Firms that are highly decentralized will prefer a distributed architecture, and those with a mixed organizational pattern should implement a mixed ("conformed") one. Lack of fit between organizational form and architecture has resulted in the political turf battles, social conflicts and technology wars ("religious wars") for which data warehousing is famous. These are avoidable by aligning form and structure. Clearly many of the firms responding to this survey have done that.

It does not make sense for a vendor to publish market share data - where is the objectivity in that? At the same time, the reader does not need a survey to know that data warehousing appliances have created buzz in the market. Thus, this is the place in the discussion to comment on the adoption of data warehousing appliances as a trend. Architecturally these are being positioned as enterprise data marts in a hub-and-spoke configuration with a centralized atomic data warehouse. Yes, there is adoption, though it is not quite as rapid or pervasive as the marketing brochures would tend to indicate. This was not a survey on pricing, but one of the consequences of the adoption of appliances and appliance-like configurations is to increase price competition in multiple tiers of the market. Obviously, this is to the advantage of enterprises that are looking to buy. As is often the case with advanced information technology, you can tell the pioneers by who has the arrows in their backs. In the case of the appliance, this means that we are at the end of the beginning. Now that the early entrants have validated the concept of the appliance, the large vendors, including IBM, will move to take back the market and do so in a way that allows business enterprises to participate without the risk of buying what could soon be an orphan technology if the startup happens to stumble. This is perhaps also the place to note that it is the beginning of the end for legacy data warehousing systems. Going forward, enterprises will need only one kind of standard relational database in order to operate both transactional and business intelligence workloads (obviously on different instances).

Editor's Note:

1. Phase 2 of the IBM Data Warehousing Satisfaction Survey is now live! Interested readers are invited to participate by clicking on the following URL and spending 20 minutes answering some 23 questions in this anonymous, Web-based survey. https://www14.software.ibm.com/iwm/web/swg-dwss/entry.shtml


Lou Agosta is an independent industry analyst in data warehousing. A former industry analyst at Giga Information Group, Agosta has published extensively on industry trends in data warehousing, data mining and data quality. He can be reached at LAgosta@acm.org.

Mario Passalacqua is the director of IBM's Worldwide Business Intelligence Group. Passalacqua has spoken and written extensively on IBM's dynamic data warehousing initiative and related topics. He can be reached at mcpassa@us.ibm.com.  

Brian C. McGoff is a leader in the IBM World Wide Business Intelligence Group. McGoff has written and spoken widely on issues in data warehousing and business intelligence. He can be reached at bmcgoff@us.ibm.com.
