DM Review | Covering Business Intelligence, Integration & Analytics

Data Warehousing Lessons Learned:
Relentless Improvements in Hardware Dwarf the Open Source Revolution

  Column published in DM Review Magazine
March 2004 Issue
  By Lou Agosta

The changing economics of data warehousing hardware mean more options for end-user enterprises as the Linux and AMD64 revolution reaches data warehousing. Improving TPC-H benchmark results and IBM's Integrated Cluster Environment (ICE) exemplify the trend. Beyond the marketing pun of IBM putting the data warehouse on ICE, the metrics make an engaging case study. A careful inspection of the numbers indicates that the open source revolution is the occasion for the price reductions, not the cause. Approximately 93 percent of the savings is due to hardware improvements and to lower database costs that are directly determined by those hardware improvements. The savings actually attributable to open source amounts to only about one percent of the original total system cost. This is the first audited benchmark to be submitted using the SUSE Linux operating system with any standard relational database (DB2 UDB 8.1 in this case). The results amount to lower costs and higher performance, as is typical of the relentless march of progress in commoditizing a successful technology. Data warehousing clients should look to open source for savings in acquisition and lifetime support costs, but they should not neglect improved hardware performance, which is currently an even greater source of savings.

Let's compare the IBM TPC-H result from July 29, 2003, with that from April 9, 2002, at the 300GB volume point. Both ran on DB2 UDB, versions 8.1 and 7.2, respectively. The overall price of the configuration fell dramatically in the intervening 15 months, from $2,636,750 to $851,953 (a drop of $1,784,797, or nearly 68 percent). Meanwhile, the composite power and throughput metric (QphH@300GB) remained about the same, increasing slightly from 12,995.4 to 13,194.9 (a tad more than one percent). This betters Moore's law (still in force, and popularly stated as processor performance doubling every 18 months): the IBM eServer hardware with 2GHz AMD chips in the July 2003 submission cost $112,935, compared with $777,812 for the ProLiant hardware with 900MHz chips in the April 2002 report. That is an 85 percent reduction in the price of the hardware in 15 months (see Figure 1).

Figure 1: Year-to-Year IBM TPC-H Improvement
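
The year-to-year comparison can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the prices and QphH values quoted in this column. The dictionary keys and the `pct_change` helper are illustrative names, and the dollars-per-QphH ratios are derived here rather than quoted from the benchmark reports.

```python
# Figures quoted in the column for the two 300GB TPC-H submissions.
# The key names are illustrative, not taken from the TPC reports.
april_2002 = {"system_price": 2_636_750, "qphh": 12_995.4, "server_price": 777_812}
july_2003 = {"system_price": 851_953, "qphh": 13_194.9, "server_price": 112_935}


def pct_change(old, new):
    """Percentage change from old to new; negative means a decrease."""
    return (new - old) / old * 100


price_drop = april_2002["system_price"] - july_2003["system_price"]
price_drop_pct = -pct_change(april_2002["system_price"], july_2003["system_price"])
qphh_gain_pct = pct_change(april_2002["qphh"], july_2003["qphh"])
server_drop_pct = -pct_change(april_2002["server_price"], july_2003["server_price"])

# Dollars per QphH, computed here from the quoted figures (an assumption of
# this sketch: the column itself does not state these ratios).
price_per_qphh_2002 = april_2002["system_price"] / april_2002["qphh"]
price_per_qphh_2003 = july_2003["system_price"] / july_2003["qphh"]

print(f"System price drop:  ${price_drop:,} ({price_drop_pct:.1f}%)")
print(f"QphH@300GB change:  {qphh_gain_pct:+.1f}%")
print(f"Server price drop:  {server_drop_pct:.1f}%")
print(f"Price per QphH:     ${price_per_qphh_2002:,.2f} -> ${price_per_qphh_2003:,.2f}")
```

Because throughput barely moved while the price fell by about two thirds, the price per unit of throughput falls by nearly the same proportion as the system price.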

The increased hardware power reduces the number of processors from 64 in the April 2002 benchmark to 16 in the July 2003 benchmark. Because the database in a clustered configuration is priced by processor, the cost of the database software falls from $1,417,792 for the April 2002 benchmark to $425,504 (including the Database Partitioning Feature, or DPF, license) for the July 2003 benchmark, a savings of $992,288 (with DPF).

Together, the hardware and database savings come to $1,657,165 ($664,877 plus $992,288), or approximately 93 percent of the $1,784,797 overall price reduction; the database accounts for approximately 56 percent of that reduction and the hardware for 37 percent. However, note that the database savings is directly determined by the hardware savings. In addition, the price of the database per instance decreased approximately 10 percent year to year. While every penny is significant, the latter is not a major factor here and, along with a modest improvement in the cost of disk, belongs to the remaining 7 percent or so not included in the bottom row of the table in Figure 1.
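
As a cross-check on the decomposition, the sketch below recomputes the hardware and database savings from the prices quoted in this column and expresses each as a share of the overall price reduction. It is illustrative only: the variable names are assumptions of this sketch, and the "remainder" is simply whatever the quoted hardware and database figures leave unexplained.

```python
# (April 2002 price, July 2003 price) pairs, as quoted in the column.
system = (2_636_750, 851_953)
hardware = (777_812, 112_935)
database = (1_417_792, 425_504)  # July 2003 figure includes the DPF license


def saving(pair):
    """Dollar saving between the earlier and the later price."""
    earlier, later = pair
    return earlier - later


total_saving = saving(system)
hardware_saving = saving(hardware)
database_saving = saving(database)
combined_saving = hardware_saving + database_saving
remainder = total_saving - combined_saving  # disk, operating system, other items


def share(amount):
    """Amount as a percentage of the overall price reduction."""
    return amount / total_saving * 100


for label, amount in [
    ("Hardware", hardware_saving),
    ("Database", database_saving),
    ("Hardware + database", combined_saving),
    ("Remainder", remainder),
]:
    print(f"{label:<20} ${amount:>10,}  {share(amount):5.1f}% of total reduction")
```

The two components are kept separate even though the database savings follows from the smaller processor count, so that the dependency is visible rather than hidden in a single number.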

Meanwhile, for the 300GB benchmarks, the actual cost of the operating system is $38,384 for Microsoft Windows 2000 Advanced Server versus $2,588 for SUSE Linux. Linux thus saves $35,796, a dramatic 93 percent, over Microsoft. However, this savings is off of a very modest base: it is only approximately 2 percent of the overall price reduction, about 4 percent of the total system cost of the July 29, 2003, benchmark and just over one percent of that of the April 2002 submission. Therefore, the open source operating system is only a small part of the overall dynamic here. While every dollar counts, there are just so many more of them in the case of the hardware and database.
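
To put the operating system savings in proportion, the following minimal sketch relates it to three different bases, all taken from the figures quoted in this column: the Windows license price, the two total system prices, and the overall price reduction. The variable names are illustrative.

```python
# Operating system prices quoted for the two 300GB submissions.
windows_price = 38_384  # Microsoft Windows 2000 Advanced Server (April 2002)
suse_price = 2_588      # SUSE Linux (July 2003)

# Total system prices quoted for the same two submissions.
system_price_2002 = 2_636_750
system_price_2003 = 851_953

os_saving = windows_price - suse_price
total_reduction = system_price_2002 - system_price_2003

# The same dollar amount looks very different depending on the base.
pct_of_windows_price = os_saving / windows_price * 100
pct_of_2002_system = os_saving / system_price_2002 * 100
pct_of_2003_system = os_saving / system_price_2003 * 100
pct_of_total_reduction = os_saving / total_reduction * 100

print(f"OS saving:                  ${os_saving:,}")
print(f"  vs. Windows price:        {pct_of_windows_price:.1f}%")
print(f"  vs. April 2002 system:    {pct_of_2002_system:.1f}%")
print(f"  vs. July 2003 system:     {pct_of_2003_system:.1f}%")
print(f"  vs. overall reduction:    {pct_of_total_reduction:.1f}%")
```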

As indicated, the lower cost of the processor hardware and the resulting savings from the reduced number of database instances are responsible for the lion's share of the savings. It is quite likely that dramatic savings would still be available even if open source were not a part of the equation. Open source has numerous benefits, including breaking the hold of technology lock-in and reducing acquisition and lifetime support costs. However, clients should also attend to the significant savings that come from the natural trajectory of technology innovation, which generates improved hardware performance as a great source of savings in its own right.



Lou Agosta is the lead industry analyst at Forrester Research, Inc. in data warehousing, data quality and predictive analytics (data mining), and the author of The Essential Guide to Data Warehousing (Prentice Hall PTR, 2000). Please send comments or questions to lagosta@acm.org.


