Surviving the Perfect Storm in Data Management

  Article published in DM Review Magazine
January 2001 Issue
  By David Clements

The emergence of new data-intensive applications is resulting in the accumulation of huge amounts of digital information. Indeed, the collection of data that analysts so calmly referred to as a "sea of data" just ten years ago has now swollen to tsunami forces. As a result, one might say that today's data management professionals are facing a "perfect storm," and calmer waters are not yet in sight.

The Wave

Most analytical reports today cite exorbitant factors when projecting enterprise data growth over the next five years. For example:

  • Red Herring's March 2000 summary, "The Age of Petabytes," forecasted data growth rates of 75 to 150 percent per year.
  • A META Group analyst speaking at a May 2000 eCRM conference projected a hundredfold increase in data within five years, through 2004. Enterprises that have difficulty coping with three terabytes (TB) of data today must quickly find solutions for dealing with 300 terabytes of data tomorrow. Since that conference, other META representatives have validated this growth factor, as well as the urgent need for in-depth strategic data management planning.
  • A recent Deutsche Banc Alex.Brown data study found that e-business data will grow from 30 percent of the total data in the first year of activity (1999, in most cases) to 75 percent of the total data in the fourth year.1 This growth represents a commanding data swell of 400 percent per year.

Figure 1 integrates the data in these three studies to arrive at a seemingly dependable consensus about the rate of data growth. It uses the Red Herring data as the basis of the graph. The triangulated outlook applies to global 2000 enterprises and assumes an average starting point of three terabytes of total in-house data in 1999. The striped sections of the color-coded vertical bars estimate the percentage of growth stimulated and consumed by e-business activities.

Figure 1: Data Growth Projections
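The compounding these studies imply can be sketched in a few lines of Python. The 3 TB starting point and the 75 to 150 percent yearly growth rates are the figures quoted above; everything else is arithmetic.

```python
# Sketch of the Figure 1 data-growth projections (illustrative only).
# Starting point: 3 TB of total in-house data in 1999, per the article.

START_TB = 3.0

def project(growth_per_year, years=5):
    """Compound yearly growth; growth_per_year=1.5 means +150% (x2.5 per year)."""
    return [START_TB * (1 + growth_per_year) ** y for y in range(years + 1)]

low = project(0.75)    # Red Herring low estimate: 75% per year
high = project(1.50)   # Red Herring high estimate: 150% per year

# META's "hundredfold in five years" lines up with the high estimate:
print(f"1999: {high[0]:.0f} TB -> 2004: {high[-1]:.0f} TB "
      f"({high[-1] / high[0]:.0f}x)")
```

Note how quickly the two estimates diverge: at 75 percent per year the 2004 total is still under 50 TB, while at 150 percent it is nearly 300 TB.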

Understanding and acceptance of these predictions comes only after you consider the scope of new business initiatives and the technological capabilities that both enable and support them. New e-business applications such as Web-front management (clickstream analysis), one-to-one customer relationship management (CRM), personalization and encounter management, supply chain management, call event detail analysis and digital certification significantly add to an enterprise's existing IT-supported agenda. In addition, new nonscalar data types (objects) including images (drawings, X-rays, etc.), streaming audio and video dramatically expand the data inventory.

Surging TCO Costs

It's easy to suppose that more disk storage and technological advances will somehow ease the cost of managing this tidal wave of new data. However, as Figure 2 indicates, even with the expected continuation of the decline in RAID price/terabyte, the total expenditure necessary to accommodate the projected data growth will escalate more than tenfold in the next five years.

To be consistent with the data in Figure 1, Figure 2 uses the following numbers:

  • A 1999 total storage starting point of three terabytes of serviced data, which grows at a rate of 150 percent per year.
  • $300K/TB as the starting cost of disk storage.
  • A decline in storage costs of 30 percent per year, which is the figure projected by most analysts, including IDC, Gartner and META.
  • An expenditure calculation that uses a total cost of ownership (TCO) composite, which takes into account the hardware price/TB plus overhead factors for storage and data management tools and services.

The message expressed in Figure 2 is startling. Over the next five years, given a sixfold decrease in price/TB and a hundredfold increase in data, you can expect a thirteenfold increase in total data management costs.

Figure 2: Storage Pricing and Expenditures
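Putting the bullet-point inputs above into code makes the compounding effect explicit. This is a back-of-the-envelope sketch using the raw hardware price only; the article's figure also folds in a TCO composite of overhead factors for tools and services, which this omits, so the exact multiplier differs.

```python
# Sketch of the Figure 2 expenditure projection using the stated inputs:
# 3 TB in 1999, +150% data growth per year, $300K/TB, -30% price decline per year.

data_tb = 3.0
price_per_tb = 300_000.0   # dollars per TB, raw hardware cost

yearly_cost = []
for year in range(1999, 2005):
    yearly_cost.append(data_tb * price_per_tb)
    data_tb *= 2.5          # +150% data growth per year
    price_per_tb *= 0.7     # -30% price decline per year

ratio = yearly_cost[-1] / yearly_cost[0]
print(f"price fell {1 / 0.7**5:.1f}x, data grew {2.5**5:.0f}x, "
      f"spend rose {ratio:.1f}x")
```

The raw compounding works out to roughly a sixteenfold rise in spend; the article's thirteenfold figure presumably reflects additional assumptions in its TCO composite that this sketch does not model. Either way, the direction is unmistakable: falling unit prices lose the race against data growth.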

Finding New Harbors

Most IT professionals would agree that these soaring expenditures are unacceptable. Although the promise of Web-enabled business applications offers greater profitability, conscientious CIOs must look for, consider and embrace more cost-effective data management strategies and alternatives.

One of the first things to consider (or reconsider) is data management treatment. How do you use and effectively manage different types of data? Figure 3 classifies data management treatment into a framework with four distinct categories of information processing. Each category requires different software and technology.

Figure 3: Data Management Framework

First, there's active versus supportive data. You must keep active data fresh and current because it's used by operational procedures. In contrast, you may derive supportive data and refresh it periodically (for example, nightly or weekly). Informational and analytical applications use supportive data.

Next, there's high-concurrency versus high-volume storage. High-concurrency storage typically consists of rotating disk that services large numbers of simultaneous accesses to small amounts of data, measured in transactions per second (TPS). High-volume storage typically deploys multiple media (disk, optical and tape) and services small numbers of simultaneous requests for large volumes of data, measured in gigabytes per minute (GPM).

In most cases, the majority of enterprise data does not need to be maintained in an active profile on high-concurrency storage media. As Figure 4 indicates, as little as 15 percent of the total data resource may need to reside there, depending on the application.

Figure 4: Data Allocation Framework

Aging details such as clickstream logs, order entry line items, call event detail records, inactive account descriptions, audit summary backups and many other file segments that do not require subsecond retrieval rates can be allocated to more cost-effective, high-volume storage.
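The allocation decision the framework implies can be sketched as a simple classifier. The two yes/no questions come straight from the framework; the tier names and the example data sets are illustrative, since the article gives only the qualitative split.

```python
# Illustrative tiering rule for the Figure 3/4 framework. The decision
# inputs mirror the active/supportive and concurrency/volume distinctions.

from dataclasses import dataclass

@dataclass
class DataSet:
    name: str
    needs_subsecond_access: bool   # operational, high-concurrency use
    refreshed_continuously: bool   # active, versus periodically derived

def storage_tier(d: DataSet) -> str:
    """Place active, high-concurrency data on disk; everything else moves off."""
    if d.needs_subsecond_access and d.refreshed_continuously:
        return "active / high-concurrency (RAID disk)"
    return "supportive / high-volume (optical, tape library)"

print(storage_tier(DataSet("account master", True, True)))
print(storage_tier(DataSet("clickstream log archive", False, False)))
```

Applied across an inventory of enterprise data sets, a rule like this is what drives the 15 percent figure in Figure 4: only the small active core earns its place on high-concurrency media.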

Charting the Course

Analytics have always been key to successful business endeavors. However, in the e-business arena, the time element for many applications is so critical that enterprises can no longer afford the luxury of offline decision making. Many of the manual processes that previously depended on decision support systems (DSSs) must now be automated.

META Group's infrastructure recommendations at the May 2000 eCRM conference further endorse the active versus supportive, high-concurrency versus high-volume storage framework. For example, META suggests that e-CRM systems should be bolstered with two levels of analytics: real time (active) and batch (supportive). META refers to this split as micro- and macro-analytics, respectively (refer to Figure 5).

Figure 5: Two-Tier Analytics

Within the batch portion of the storage framework, long-running macro data mining applications cultivate attributes and triggers that are posted in an account master record in an online operational data store (ODS). In turn, the real-time micro-analytics run on top of the ODS like a state machine and react in accordance with the derived triggers and attributes. For example, a reaction to a stored trigger might be a notification message about new product offerings that parallel a customer's previous buying trends. This delegation of process has two areas of positive impact. Shifting the macro-analytics to the supportive/high-volume data management environment improves the real-time performance factor. Furthermore, transferring the high-volume detail data to a more economical storage platform realizes tremendous cost savings.
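The macro/micro split can be sketched as two cooperating processes: a batch tier that mines detail data and posts triggers into the ODS, and a lightweight real-time tier that reacts to those triggers. All names, record fields and the mining rule here are hypothetical; the sketch only illustrates the division of labor.

```python
# Sketch of the two-tier analytics split (hypothetical names throughout).
# The ODS is modelled as a plain dict keyed by account ID.

ods = {}  # operational data store: account -> derived attributes/triggers

def macro_analytics(detail_records):
    """Batch tier: mine high-volume detail data, post triggers to the ODS."""
    for rec in detail_records:
        if rec["category_purchases"] >= 3:   # hypothetical buying-trend rule
            ods.setdefault(rec["account"], {})["offer_trigger"] = rec["category"]

def micro_analytics(account):
    """Real-time tier: react to triggers already derived by the batch run."""
    trigger = ods.get(account, {}).get("offer_trigger")
    if trigger:
        return f"notify {account}: new {trigger} offerings"
    return None

macro_analytics([{"account": "A17", "category": "jazz", "category_purchases": 4}])
print(micro_analytics("A17"))
```

The point of the split is visible in the code: the expensive scan over detail records runs offline against high-volume storage, while the real-time path does nothing heavier than a keyed lookup against the ODS.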

Calmer Waters

Figure 6 depicts a vastly improved cost projection based on using the proposed high-volume/high-concurrency data allocation framework and the two-tiered analytical model. These projections include the integration of an alternative data management solution. This type of solution satisfies the high-volume requirements of the proposed data management framework in Figure 3 by providing data recovery and deep volumes of atomic detail data for supportive analytics, all in one cohesive strategy.

This alternative management solution accommodates high-volume needs with RAID media and cost-effective robotic storage (automated seek-and-play elements that include optical disk jukeboxes and high-speed automated cartridge tape libraries). It also provides sophisticated storage management and relational database management software. This software automates critical system management tasks such as data migration, backup and recovery as well as row/record level selectivity regardless of database size or location of data within the storage hierarchy.

In Figure 6, the broken line represents the TCO mapping initially described in Figure 2. Contrast this line with the actual cost of storing high-volume data on a more appropriate platform. As you can see, by taking the same volume of data, but delegating it to the proposed subsystem alternatives, you can achieve tremendous cost savings.

Figure 6: An Economic Alternative

The example in Figure 6 keeps 30 percent of the data on more costly high-response RAID technology and migrates 70 percent of the data to a high-volume alternative technology.

Although this scenario is somewhat conservative, it represents a savings of more than $160 million over the charted five-year period. Every additional one percent of data shifted from high-concurrency storage to alternative storage by year 2004 will represent even more savings.
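The effect of the 30/70 split on spend can be illustrated with a blended-cost calculation. The 10:1 price advantage assumed for the high-volume platform is a hypothetical figure for illustration, as are the volume and price inputs; the article does not quote a per-TB price for the alternative technology.

```python
# Blended-cost sketch for the Figure 6 scenario. The high-volume platform's
# relative price (here one-tenth of RAID) is an illustrative assumption.

def blended_cost(total_tb, raid_price_per_tb, high_volume_fraction,
                 high_volume_discount=0.1):
    """Total spend when a fraction of the data moves to cheaper storage."""
    raid_tb = total_tb * (1 - high_volume_fraction)
    hv_tb = total_tb * high_volume_fraction
    return (raid_tb * raid_price_per_tb
            + hv_tb * raid_price_per_tb * high_volume_discount)

all_raid = blended_cost(300, 50_000, 0.0)   # everything on RAID
split = blended_cost(300, 50_000, 0.7)      # 70% migrated off RAID
print(f"savings: {1 - split / all_raid:.0%}")
```

Under these assumptions the 70 percent migration cuts total storage spend by well over half, which is the mechanism behind the savings charted in Figure 6.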

Surviving the Storm

Much like nature's "perfect storm," several forces in the IT world are converging to form data management problems of a magnitude that transcends previous experience. Both the supply and demand sides of the business information equation are escalating together at a whirlwind pace. New-generation e-applications are producing torrents of data at a predicted hundredfold, five-year growth rate. At the same time, enterprise managers continue to thirst for insights that can only be gained from analyzing these massive amounts of accurate and timely detail data. Add to this turbulence the overwhelming projected cost of containing and managing these torrents of data, and it becomes apparent that new alternatives must be considered.

1 Dolan, Timothy J., C.F.A. "eCRM: The Difference Between Winners and Losers in the e-Business World of the 21st Century." Deutsche Banc Alex.Brown. North American Equity Research/US Enterprise Software. September 15, 1999.



David Clements is vice president of marketing for eBusiness Programs, FileTek's data management service provider division. Clements has more than thirty years of experience in database and information technologies. He can be reached at dclements@filetek.com.

