Introducing the Data Warehouse Appliance, Part 2
Building Business Intelligence
The following column is excerpted from the white paper, "Introducing the Data Warehouse Appliance," by William McKnight.
data warehouse appliance n., 1: a hardware/software/OS/DBMS/storage bundle designed to perform traditional and complex analysis functions using commodity components at a price/performance advantage over traditional approaches.
Datallegro and Netezza are examples of data warehouse appliance vendors. As such, they offer pre-integrated platforms, storage, relational database management systems (RDBMSs) and their own software to make it all work together according to their specifications, but that doesn't mean their configurations are identical.
Datallegro uses Novell's SUSE Linux as its open source OS and Ingres as its open source RDBMS. Netezza also runs on open source Linux, but uses the version provided by Red Hat, with Postgres as its open source RDBMS. For networking, Netezza uses Gigabit Ethernet and Datallegro uses InfiniBand.
They also differ in their architectural approaches. Datallegro configures off-the-shelf components into dual-CPU, multi-disk "bricks" as its unit of parallelism. Datallegro says this architecture delivers balanced performance for general-purpose data warehousing (i.e., a mixed query workload) by marrying the power of dual CPUs with very high direct-attach I/O capacity. The company further claims that its data distribution significantly reduces network traffic on joins.
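The column does not say how Datallegro's data distribution reduces join traffic. One common technique in parallel databases is to hash-distribute both tables on the join key, so that matching rows land on the same node and each node can join locally. The Python sketch below illustrates that general idea only; it is an assumption for illustration, not a description of Datallegro's implementation, and the names `distribute` and `colocated_join` are hypothetical.

```python
from collections import defaultdict


def distribute(rows, key, num_nodes):
    """Assign each row to a node by hashing its join key (illustrative only)."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[hash(row[key]) % num_nodes].append(row)
    return nodes


def colocated_join(left_nodes, right_nodes, key, num_nodes):
    """Join node by node; no row is ever compared with a row on another node."""
    joined = []
    for node_id in range(num_nodes):
        # Build a local lookup table from the left-hand rows on this node.
        index = defaultdict(list)
        for left_row in left_nodes.get(node_id, []):
            index[left_row[key]].append(left_row)
        # Probe it with the right-hand rows stored on the same node.
        for right_row in right_nodes.get(node_id, []):
            for left_row in index.get(right_row[key], []):
                joined.append({**left_row, **right_row})
    return joined


# Hypothetical sample data: integer keys keep the hashing deterministic.
customers = [{"cust_id": 1, "name": "A"}, {"cust_id": 2, "name": "B"}]
orders = [
    {"cust_id": 1, "total": 5},
    {"cust_id": 1, "total": 7},
    {"cust_id": 3, "total": 9},
]

NUM_NODES = 4
customer_nodes = distribute(customers, "cust_id", NUM_NODES)
order_nodes = distribute(orders, "cust_id", NUM_NODES)
joined_rows = colocated_join(customer_nodes, order_nodes, "cust_id", NUM_NODES)
```

Because both tables are distributed with the same hash on the same key, every matching pair is already on one node, so the join itself moves no rows across the network.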
Netezza's unit of parallelism is their Snippet Processing Unit (SPU). The SPU consists of a disk drive and a special-purpose computer with hard-wired logic for accelerating record management and analysis. According to a recent Forbes article, "The chip queries the data right at the drive, passing back only the correct answers to the main computer, which runs Netezza's own database software program. The machine runs faster because fewer files are flying back and forth." (Forbes, December 13, 2004).
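The effect Forbes describes, filtering at the drive and returning only matching records, can be sketched in a few lines of Python. This is a simplified illustration, not Netezza's design: `StorageUnit`, `query_with_pushdown` and `query_without_pushdown` are hypothetical names, and the "rows shipped" counter is just a stand-in for traffic between the drives and the main computer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass
class StorageUnit:
    """Hypothetical stand-in for one disk plus its local filtering logic."""
    rows: List[Row]

    def scan(self, predicate: Callable[[Row], bool]) -> List[Row]:
        # Filter next to the data, so only matching rows leave the unit.
        return [row for row in self.rows if predicate(row)]


def query_with_pushdown(units, predicate) -> Tuple[List[Row], int]:
    """Each unit filters locally; only matches are sent to the host."""
    results, shipped = [], 0
    for unit in units:
        matches = unit.scan(predicate)
        shipped += len(matches)
        results.extend(matches)
    return results, shipped


def query_without_pushdown(units, predicate) -> Tuple[List[Row], int]:
    """Every row is sent to the host, which then does the filtering."""
    results, shipped = [], 0
    for unit in units:
        shipped += len(unit.rows)
        results.extend(row for row in unit.rows if predicate(row))
    return results, shipped


# Hypothetical sample data spread over two storage units.
units = [
    StorageUnit([{"region": "east", "amount": 10}, {"region": "west", "amount": 20}]),
    StorageUnit([{"region": "east", "amount": 30}]),
]


def is_east(row: Row) -> bool:
    return row["region"] == "east"


pushed_rows, pushed_shipped = query_with_pushdown(units, is_east)
host_rows, host_shipped = query_without_pushdown(units, is_east)
```

Both functions return the same answer; the difference is how many rows have to travel to the host, which grows with the selectivity of the query.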
The vendors also differ in their product positioning. Datallegro positions itself as a general-purpose bolt-on to terabyte-and-beyond Oracle data warehouse environments, whereas Netezza is targeting high-end enterprise data warehouse environments.
One important characteristic the vendors in the data warehouse appliance market share is that they are taking a fresh look at an old problem. By challenging conventional price points for storing complete corporate data, and the development cycles needed to make that data accessible and bring it under management, they hope to render entrenched views obsolete. This is one example of the many new approaches and mind-set changes that the appliance model brings to a company deploying it.
The data warehouse appliance industry has already cleared some hurdles. Data load rates are impressive. Performance of selective queries, especially against large volumes of data, is notably strong because of the automatic parallelism. It is difficult to validate low TCO for a mixed-workload data warehouse environment at this time, but low TCO appears to follow naturally from the appliance approach.
Unproven areas include highly concurrent environments, management tools (for those times when you do need to tune the system), vendor support (although SQL, ODBC and JDBC are supported) and named reference accounts. However, most of these are issues of maturity, not inherent flaws in the architecture.
Appliances are already solving real-world problems. One wireless carrier, for example, can now run revenue assurance analysis over 120 days of data in less than 30 minutes (versus 6 hours for a single day), and traffic pattern analysis that previously took 23 hours now takes 30 minutes.
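To put the carrier's figures on a common scale, the short Python calculation below compares minutes of processing per day of data for the revenue assurance case and total runtime for the traffic pattern case. It uses only the numbers quoted above and treats "less than 30 minutes" as an upper bound, so the revenue assurance ratio is a minimum; the helper name `speedup` is hypothetical.

```python
def speedup(old_minutes: float, new_minutes: float) -> float:
    """Ratio of old runtime to new runtime for the same amount of work."""
    return old_minutes / new_minutes


# Revenue assurance: 6 hours per day of data before,
# at most 30 minutes for 120 days of data after.
old_minutes_per_day = 6 * 60
new_minutes_per_day = 30 / 120
revenue_assurance_speedup = speedup(old_minutes_per_day, new_minutes_per_day)

# Traffic pattern analysis: 23 hours before, 30 minutes after.
traffic_pattern_speedup = speedup(23 * 60, 30)

print(f"Revenue assurance: at least {revenue_assurance_speedup:.0f}x per day of data")
print(f"Traffic pattern analysis: {traffic_pattern_speedup:.0f}x")
```

On these figures the revenue assurance workload is at least 1,440 times faster per day of data, and the traffic pattern analysis is 46 times faster.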
If you're committed to physical data warehousing and have a terabyte-plus warehouse or designs for one, stay aware of data warehouse appliances. Will the market recognize them in time, or are they ahead of their time? Will traditional vendors such as Oracle, HP, Teradata and IBM close the gap? These questions remain to be answered.
William McKnight is partner, Information Management, at Lucidity Consulting Group. William functions as strategist, lead enterprise information architect and program manager for complex, high-volume full life-cycle implementations worldwide utilizing the disciplines of data warehousing, master data management, business intelligence, data quality and operational business intelligence. Many of his clients have gone public with their success stories. McKnight is a Southwest Entrepreneur of the Year Finalist and a frequent best practices judge; he has authored more than 150 articles and white papers and given more than 150 international keynotes and public seminars. His teams' implementations from both IT and consultant positions have won Best Practices awards. He is a former IT VP of a Fortune company, a former engineer of DB2 at IBM and holds an MBA. He can be reached at wmcknight@luciditycg.com.