Data Warehousing Lessons Learned:
Data Warehousing Refresh Rates
The optimal refresh frequency for a data warehouse depends on the industry, the application, the business process, the time horizon of that business process and the underlying technical infrastructure. The business process in particular is decisive: a three-week demand-planning supply chain calls for a different refresh rate than a customer waiting on the phone. Also, the "optimal" frequency is not necessarily the "standard" or "most common" frequency. (People who respond to surveys are reluctant to admit that what they are doing is less than optimal, even if it is.) For example, if yesterday's sales data is captured by an automated system at a point-of-sale terminal, then it is reasonable to ask to see yesterday's sales data today. On the other hand, if the sale is not really booked until the invoice is paid, then the same request is less reasonable. In that situation, one would reasonably expect to see invoices that were paid yesterday reported as completed sales today.
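The point-of-sale versus paid-invoice distinction can be made concrete with a small sketch. It is only an illustration: the `Sale` record, the `reportable_sales` helper and the sample dates are hypothetical and do not come from the survey or from any particular warehouse product. It shows that the same daily refresh reports different "completed sales" depending on when the business considers a sale booked.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Sale:
    order_id: str
    captured_on: date          # day the point-of-sale system recorded the sale
    paid_on: Optional[date]    # day the invoice was paid; None if still open


def reportable_sales(sales: List[Sale], report_day: date,
                     booked_when_paid: bool) -> List[Sale]:
    """Sales that a daily refresh run on report_day shows as completed yesterday.

    booked_when_paid=False: a sale counts when the terminal captures it.
    booked_when_paid=True:  a sale counts only when its invoice is paid.
    """
    yesterday = report_day - timedelta(days=1)
    if booked_when_paid:
        return [s for s in sales if s.paid_on == yesterday]
    return [s for s in sales if s.captured_on == yesterday]


# Hypothetical sample data, invented for illustration only.
sales = [
    Sale("A1", date(2003, 2, 10), date(2003, 2, 10)),  # captured and paid yesterday
    Sale("A2", date(2003, 2, 10), None),               # captured yesterday, unpaid
    Sale("A3", date(2003, 1, 20), date(2003, 2, 10)),  # older sale, paid yesterday
]
today = date(2003, 2, 11)

pos_view = [s.order_id for s in reportable_sales(sales, today, booked_when_paid=False)]
paid_view = [s.order_id for s in reportable_sales(sales, today, booked_when_paid=True)]
print(pos_view)   # ['A1', 'A2']
print(paid_view)  # ['A1', 'A3']
```

Under either rule a daily refresh is sufficient; what changes is which events the business process treats as reportable on a given day.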
Figure 1: Data Warehouse Refresh Rates
|Data Warehouse Refresh Rates||Currently||In 18 Months|
|Many times a day||2%||14%|
|Near real time||0%||10%|
Source: The survey was conducted at the TDWI World Conference in New Orleans, Feb. 9-15, 2003. The Quarterly Technology Survey is administered by The Data Warehousing Institute and Giga Information Group.
© 2003 Giga Information Group, Inc., a wholly owned subsidiary of Forrester Research, Inc. and The Data Warehousing Institute. All rights reserved. Reproduction or redistribution in any form without prior permission is expressly prohibited.
The question "What is the optimal (standard) refresh rate for production data warehouses?" requires a quantitative answer. We teamed up with our colleagues at The Data Warehousing Institute (TDWI) to provide one, and that answer is "daily." Participants at the February 2003 TDWI Conference in New Orleans reported daily as the most common refresh rate for data warehouses. According to the survey, near real-time data warehousing is barely on the radar, with only two percent reporting multiple updates per day. The vast majority of respondents update the data warehouse daily (75 percent), with many also performing monthly (41 percent) and weekly (26 percent) updates. (Note that multiple responses were allowed, and some enterprises report using all three refresh rates.)
However, the share of survey respondents who expect to perform multiple daily updates to the data warehouse (or near real-time data warehousing) grows from not quite two percent today to more than 24 percent in 18 months. Enterprises do not always perform as anticipated, but the figure is still likely to be an accurate expression of a business requirement. Under any interpretation, that is significant expected growth, albeit from a modest base.
The potential for vendor hype is significant, and it is important for enterprises to appreciate the complexities and trade-offs in undertaking near real-time processing. The zero-latency data warehouse sometimes also requires the zero-latency business enterprise. For example, the product properly scheduled by the 128-way massively parallel processor may be on the loading dock on time, but the truck that will transport the macaroni and cheese product to the customer may be stuck in traffic. It is very important to let the need for reduced latency in the business process itself drive the acquisition and development of the technology.
For example, if the customer is on the phone, a real-time recommendation makes sense. However, if a product supply chain is two weeks long, knowing what products are selling on a minute-by-minute basis is probably overkill. An overnight batch run will be less expensive and will still result in replenishment in ample time. Savvy IT organizations will get ready for real-time data warehousing (and related functions such as data quality), but will continue to trade off cost and complexity against reduced latency to find the optimal price/performance for their own enterprise's requirements.
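The trade-off described above can be sketched as a simple selection rule: pick the least expensive refresh option whose data latency still fits the business process. This is a minimal sketch under stated assumptions. The `RefreshOption` type, the `cheapest_adequate` helper, the latency figures and the cost index are all hypothetical values chosen for illustration; none of them come from the survey.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RefreshOption:
    name: str
    latency_hours: float   # assumed worst-case age of the data between refreshes
    relative_cost: float   # hypothetical cost index, not measured data


# Illustrative options only; real latencies and costs vary by enterprise.
OPTIONS = [
    RefreshOption("near real time", 0.01, 10.0),
    RefreshOption("many times a day", 4.0, 5.0),
    RefreshOption("daily", 24.0, 1.0),
    RefreshOption("weekly", 168.0, 0.5),
]


def cheapest_adequate(options: List[RefreshOption],
                      tolerable_latency_hours: float) -> RefreshOption:
    """Return the lowest-cost option whose latency meets the business need."""
    adequate = [o for o in options if o.latency_hours <= tolerable_latency_hours]
    if not adequate:
        raise ValueError("no refresh option meets the latency requirement")
    return min(adequate, key=lambda o: o.relative_cost)


# Customer on the phone: data must be at most a few minutes old (assumed 0.05 h).
call_center = cheapest_adequate(OPTIONS, tolerable_latency_hours=0.05)

# Two-week supply chain replenished from an overnight batch (assumed 24 h).
supply_chain = cheapest_adequate(OPTIONS, tolerable_latency_hours=24.0)

print(call_center.name)    # near real time
print(supply_chain.name)   # daily
```

The rule makes the business process, not the technology, the input: tightening the tolerable latency is the only thing that moves the answer toward the more expensive options.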
Lou Agosta, Ph.D., joined IBM WorldWide Business Intelligence Solutions in August 2005 as a BI strategist focusing on competitive dynamics. He is a former industry analyst with Giga Information Group, has served as an enterprise consultant with Greenbrier & Russel and has worked in the trenches as a database administrator in prior careers. His book The Essential Guide to Data Warehousing is published by Prentice Hall. Agosta may be reached at LoAgosta@us.ibm.com.