Data Warehousing Lessons Learned:
Data Warehousing Refresh Rates

Column published in DM Review Magazine, June 2003 Issue
By Lou Agosta

The optimal refresh frequency for a data warehouse depends on the industry, the application, the business process, the time horizon of the business process and the underlying technical infrastructure. In particular, the business process is decisive: if I have a three-week demand-planning supply chain, the refresh rate will be different from the one I need when I have a customer on the phone. Also, the "optimal" frequency is not necessarily the "standard" or "most common" frequency. (People who respond to surveys are reluctant to admit that what they are doing is less than optimal, even if it is.) For example, if yesterday's sales data is captured by the automated system at a point-of-sale terminal, then it is reasonable to request to see yesterday's sales data today. On the other hand, if the sale is not really booked until the invoice is paid, then the same request is less reasonable. In that situation, one would reasonably expect to see invoices that were paid yesterday reported as completed sales today.

Data Warehouse Refresh Rates    Currently    In 18 Months
Monthly                         41%          27%
Weekly                          26%          29%
Daily                           75%          72%
Many times a day                2%           14%
Near real time                  0%           10%
Source: The survey was conducted at the TDWI World Conference in New Orleans, Feb. 9-15, 2003. The Quarterly Technology Survey is administered by The Data Warehousing Institute and Giga Information Group.
© 2003 Giga Information Group, Inc., a wholly owned subsidiary of Forrester Research, Inc. and The Data Warehousing Institute. All rights reserved. Reproduction or redistribution in any form without the prior permission is expressly prohibited
Figure 1: Data Warehouse Refresh Rates
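
The percentages in Figure 1 can be tabulated to check the arithmetic behind the survey discussion. The following is a minimal sketch, not part of the original survey materials: the names `survey`, `sub_daily_share` and `most_common` are hypothetical, and adding the two faster-than-daily categories together is an assumption, since multiple responses were allowed and the same respondent may appear in both rows.

```python
# Percentages as printed in Figure 1: (currently, in 18 months).
# Multiple responses were allowed, so a column need not sum to 100.
survey = {
    "Monthly": (41, 27),
    "Weekly": (26, 29),
    "Daily": (75, 72),
    "Many times a day": (2, 14),
    "Near real time": (0, 10),
}

CURRENTLY, IN_18_MONTHS = 0, 1


def sub_daily_share(column):
    """Add the two faster-than-daily rows for one column.

    Assumption: the rows are simply added, which may double count
    respondents who selected both categories.
    """
    return survey["Many times a day"][column] + survey["Near real time"][column]


def most_common(column):
    """Return the refresh rate with the highest percentage in a column."""
    return max(survey, key=lambda rate: survey[rate][column])


if __name__ == "__main__":
    print("Most common today:", most_common(CURRENTLY))
    print("Sub-daily share today:", sub_daily_share(CURRENTLY), "percent")
    print("Sub-daily share in 18 months:", sub_daily_share(IN_18_MONTHS), "percent")
```

Under these assumptions, daily is the most common rate in both columns, and the combined faster-than-daily share moves from 2 percent to 24 percent.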

The question "What is the optimal (standard) refresh rate for production data warehouses?" is one that requires a quantitative answer. We have teamed up with our colleagues at The Data Warehousing Institute (TDWI) to provide an answer, and that answer is "daily." Daily is reported as the most common refresh rate for data warehouses by participants at the February 2003 TDWI Conference in New Orleans. According to the survey, near real-time data warehousing is barely on the radar at all, with only two percent reporting multiple updates per day. As indicated, the vast majority of respondents update the data warehouse daily (75 percent), with many also performing monthly (41 percent) and weekly (26 percent) updates. (Note that multiple responses were allowed and some enterprises report using all three refresh rates.)

However, the share of survey respondents who expect to perform multiple daily updates to the data warehouse (or near real-time data warehousing) grows from not quite two percent today to more than 24 percent in 18 months. It is true that enterprises do not always perform as anticipated, but the projection is still likely to be an accurate expression of a business requirement. Under any interpretation, that is significant expected growth, albeit from a modest base. The potential for vendor hype is significant, and it is important for enterprises to appreciate the complexities and trade-offs in undertaking near real-time processing.

The zero-latency data warehouse sometimes also requires the zero-latency business enterprise. For example, the product properly scheduled by the 128-way massively parallel processor may be on the loading dock on time, but the truck that will transport the macaroni and cheese product to the customer may be stuck in traffic. It is very important to let the need for reduced latency in the business process itself drive the acquisition and development of the technology. For example, if the customer is on the phone, a real-time recommendation makes sense. However, if a product supply chain is two weeks long, knowing what products are selling on a minute-by-minute basis is probably overkill. An overnight batch run will be less expensive and will still result in replenishment in ample time. Savvy IT organizations will get ready for real-time data warehousing (and related functions such as data quality), but will continue to trade off cost and complexity against reduced latency to find the optimal price/performance for their own enterprise's requirements.
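
The rule of thumb above, letting the time horizon of the business process drive the refresh rate, might be sketched as follows. This is an illustration only: the function name `suggest_refresh` and the numeric cutoffs are hypothetical, since the column gives examples (a customer on the phone, a two-week supply chain) but no thresholds.

```python
from datetime import timedelta


def suggest_refresh(process_horizon: timedelta) -> str:
    """Suggest a refresh cadence from the business process time horizon.

    The cutoffs below are illustrative assumptions, not figures from
    the column or the survey.
    """
    if process_horizon <= timedelta(minutes=5):
        # e.g. a customer waiting on the phone for a recommendation
        return "near real time"
    if process_horizon < timedelta(days=1):
        # decisions made within the working day
        return "many times a day"
    # e.g. a two-week supply chain: an overnight batch run is enough
    return "daily"


if __name__ == "__main__":
    print(suggest_refresh(timedelta(minutes=3)))
    print(suggest_refresh(timedelta(weeks=2)))
```

A real assessment would also weigh the cost and complexity of the faster options against the value of the reduced latency, which a single threshold function does not capture.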

 

Lou Agosta, Ph.D., joined IBM WorldWide Business Intelligence Solutions in August 2005 as a BI strategist focusing on competitive dynamics. He is a former industry analyst with Giga Information Group, has served as an enterprise consultant with Greenbrier & Russel and has worked in the trenches as a database administrator in prior careers. His book The Essential Guide to Data Warehousing is published by Prentice Hall. Agosta may be reached at LoAgosta@us.ibm.com.
