DM Review | Covering Business Intelligence, Integration & Analytics

Volume Analytics:
Get Ready for the Process Warehouse

By Guy Creese
Column published in DMReview.com
April 15, 2005

Data warehouses have now been with us for over a decade and will be with us for many years to come. However, with virtually all business data now created electronically and the price of storage dropping like a rock, a new type of business intelligence repository is starting to appear - the process warehouse (PW).

As we all know, a data warehouse is a data repository built from the get-go for business intelligence. The seminal idea was to pull important data from operational systems, transform it and put it into a database tuned for queries. The benefits would be several - transaction systems wouldn't take a performance hit due to end-user queries, and the business would have a well-ordered history which would give it guidance for the future. The initial idea has been refined over the years - that data warehouses should be smaller, or geared toward a single subject (data marts); that corporate-wide data models are essential; that the refreshing of data warehouses should be increased to make them more real time. The list goes on.

But one area that has received little refinement is the type of data that should be stored. As a general rule, data warehouses store end counts, rather than process checkpoints - for example, total units shipped in a month, rather than a unit (identified by serial number) tracked through the milestones of assembly, QA, packaging and distribution. Data warehouses have stored the corporation's end result, largely due to regulation and the fact that the end result is easier to pin down. The SEC wants to know a company's final revenue and net income, not its internal workings. In addition, getting internal checkpoint numbers into the data warehouse was usually difficult, if not impossible. For example, a lot of manufacturing companies knew when a product kit was picked and when it turned up in the finished goods warehouse, but couldn't always figure out where the "being created" product was as it moved through the factory.

The Process Warehouse - Storing Process Checkpoints

Put simply, there were a lot of digital data "dead spots" within business. However, that is changing. Compared to a decade ago - when typewritten memos resided in file cabinets, not every company had e-mail and PCs were still relatively expensive - virtually everything today is either digitally created or tracked. Memos are written in Microsoft Word; messages are sent via e-mail; workflow systems manage processes; bar code readers and point-of-sale terminals track goods and sales. RFID tags are making it even easier for companies to track goods throughout the supply chain.

In short, tracking data for virtually every business process is now available, and - just as importantly - companies can now afford to store it. Hitachi Global Storage Technologies just announced that it was testing 1TB disk drives for desktop PCs. These technological changes mean that companies can now store process checkpoints in a repository and analyze them to better understand and streamline the business. Rather than being satisfied that they got the product out the door, companies are now asking, "It took us 15 days to build that widget. Why?"
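To make the contrast concrete, here is a minimal Python sketch of what checkpoint records allow that an end count does not. The record layout (serial number, milestone, date), the milestone names and the sample dates are all invented for illustration; they follow the manufacturing example above and do not describe any particular product or schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical checkpoint records: (serial_number, milestone, date reached).
# The data is made up; SN-001 takes 15 days from kit pick to distribution.
CHECKPOINTS = [
    ("SN-001", "kit_picked",   date(2005, 3, 1)),
    ("SN-001", "assembly",     date(2005, 3, 3)),
    ("SN-001", "qa",           date(2005, 3, 12)),
    ("SN-001", "packaging",    date(2005, 3, 14)),
    ("SN-001", "distribution", date(2005, 3, 16)),
    ("SN-002", "kit_picked",   date(2005, 3, 2)),
    ("SN-002", "assembly",     date(2005, 3, 3)),
    ("SN-002", "qa",           date(2005, 3, 5)),
    ("SN-002", "packaging",    date(2005, 3, 6)),
    ("SN-002", "distribution", date(2005, 3, 7)),
]


def units_shipped(checkpoints):
    """The end count a data warehouse would keep: units that reached distribution."""
    return sum(1 for _, milestone, _ in checkpoints if milestone == "distribution")


def stage_durations(checkpoints):
    """Days elapsed between consecutive checkpoints, per serial number."""
    by_unit = defaultdict(list)
    for serial, milestone, day in checkpoints:
        by_unit[serial].append((day, milestone))

    durations = {}
    for serial, events in by_unit.items():
        events.sort()  # chronological order
        durations[serial] = [
            (prev_name, next_name, (next_day - prev_day).days)
            for (prev_day, prev_name), (next_day, next_name) in zip(events, events[1:])
        ]
    return durations


def slowest_step(steps):
    """The transition that took the most days for one unit."""
    return max(steps, key=lambda step: step[2])


if __name__ == "__main__":
    print(units_shipped(CHECKPOINTS))            # 2
    per_unit = stage_durations(CHECKPOINTS)
    print(sum(d for _, _, d in per_unit["SN-001"]))  # 15
    print(slowest_step(per_unit["SN-001"]))      # ('assembly', 'qa', 9)
```

The end count says only that two units shipped. The checkpoint history says that nine of the 15 days for SN-001 fell between the assembly and QA checkpoints, which is the sort of answer the "Why?" question is after.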

An Example: Web Analytics

An example might make the difference between a data warehouse and a process warehouse clearer. Web analytics is the discipline of using clickstream data to analyze Web site visitor behavior and value. Virtually every visitor's click on a Web site can be tracked; Web analytics takes those clicks and generates metrics such as visit length, the visitor's path through the site and customer lifetime value.
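As a rough sketch of how clicks become metrics, the Python below groups raw clicks into visits and derives visit length and path. The click layout (visitor ID, timestamp, page) and the sample rows are invented, and the 30-minute inactivity cutoff is a common convention assumed here, not something the tools discussed in this column are known to use. Customer lifetime value is left out because it needs revenue data.

```python
from datetime import datetime, timedelta

# Hypothetical clickstream rows: (visitor_id, timestamp, page).
CLICKS = [
    ("v1", datetime(2005, 4, 1, 9, 0), "/home"),
    ("v1", datetime(2005, 4, 1, 9, 2), "/products"),
    ("v1", datetime(2005, 4, 1, 9, 7), "/cart"),
    ("v1", datetime(2005, 4, 3, 20, 0), "/home"),
    ("v2", datetime(2005, 4, 1, 10, 0), "/home"),
    ("v2", datetime(2005, 4, 1, 10, 1), "/search"),
]

# Assumption: a gap of more than 30 minutes between clicks starts a new visit.
SESSION_GAP = timedelta(minutes=30)


def split_into_visits(clicks, gap=SESSION_GAP):
    """Group clicks into visits per visitor, splitting on inactivity gaps."""
    visits = []
    last_seen = {}  # visitor_id -> (timestamp of last click, index of open visit)
    for visitor, ts, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        previous = last_seen.get(visitor)
        if previous is None or ts - previous[0] > gap:
            visits.append({"visitor": visitor, "clicks": []})
            index = len(visits) - 1
        else:
            index = previous[1]
        visits[index]["clicks"].append((ts, page))
        last_seen[visitor] = (ts, index)
    return visits


def visit_metrics(visit):
    """Visit length (last click minus first click) and the path through the site."""
    times = [ts for ts, _ in visit["clicks"]]
    return {
        "visitor": visit["visitor"],
        "length": times[-1] - times[0],
        "path": [page for _, page in visit["clicks"]],
    }


if __name__ == "__main__":
    for visit in split_into_visits(CLICKS):
        print(visit_metrics(visit))
```

With the sample rows, visitor v1 produces two visits (a seven-minute visit on April 1 and a single-click visit on April 3) and v2 produces one. Note that this only works if the individual clicks are kept; a stored total of six clicks cannot be turned back into paths.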

Five years ago, although the individual clicks were initially captured, they were often thrown away after being summarized in a data warehouse. Disk space was expensive and, once again, the end counts - total clicks, total visits - were what enterprises wanted. Forward-looking companies stored clickstream history for a year; most companies (especially those in the midst of dot-com mania) heaved it after six months.

However, that practice has changed as time has gone on. Summary data is being stored longer, and individual transaction data is now being stored, rather than being deleted. E-commerce companies recognized the seasonal behavior of online customers and began storing years of data to gain better insight into future behavior. Over time, companies realized that individual visitors had been coming to their sites for years, and the thought of chopping off that visitor's history due to archiving rules became unacceptable, especially with the continued drop in storage prices.

At this point, online-savvy companies worry more about checkpoints than they do about counts. The race is on to figure out if shrinking the checkout process by one step will increase sales (generally, yes, but not always). Enterprises have figured out that if they improve the process - e.g., make the site easier to navigate, easier to buy from - the counts (total sales, dollar value per customer) will increase.
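One way to examine a checkout process step by step is a simple funnel, sketched below in Python. The page names, the four-step checkout and the sample visit paths are all hypothetical, and the matching rule (steps must occur in order, with other pages allowed in between) is just one reasonable choice.

```python
# Hypothetical four-step checkout; the page names are invented.
CHECKOUT_STEPS = ["/cart", "/shipping", "/payment", "/confirm"]

# Hypothetical visit paths, e.g. as derived from stored clickstream data.
PATHS = [
    ["/home", "/cart", "/shipping", "/payment", "/confirm"],
    ["/cart", "/shipping"],
    ["/home", "/products", "/cart"],
    ["/cart", "/shipping", "/payment"],
    ["/home", "/search"],
]


def steps_completed(path, steps):
    """How many funnel steps the visit reached, in order."""
    completed = 0
    for page in path:
        if completed < len(steps) and page == steps[completed]:
            completed += 1
    return completed


def funnel_counts(paths, steps):
    """Number of visits that reached each step of the funnel."""
    depths = [steps_completed(path, steps) for path in paths]
    return [sum(1 for depth in depths if depth > i) for i in range(len(steps))]


def step_conversion(counts):
    """Share of visits that continue from each step to the next."""
    return [
        (nxt / prev if prev else 0.0)
        for prev, nxt in zip(counts, counts[1:])
    ]


if __name__ == "__main__":
    counts = funnel_counts(PATHS, CHECKOUT_STEPS)
    print(counts)                   # [4, 3, 2, 1]
    print(step_conversion(counts))
```

Running the same calculation on visits recorded before and after a step is removed is one way to test whether the shorter checkout actually helped; the total sales figure alone would not show where visitors dropped out.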

The PW - It's Going to Arrive Sooner or Later

Admittedly, Web analytics is a bit of a special case. The uniformity of the clickstream data makes it easier to analyze process checkpoints in the online world than in the sometimes balkanized offline world. However, as more internal business processes are tracked, whether by workflow systems, bar code scanners, RFID tags or other means, businesses will increasingly demand the ability to measure and optimize processes, no matter where they reside. And, since optimization of one step sometimes de-tunes something else, companies will want to check whether they're doing better or worse - overall - than they were six months ago. A process warehouse may not yet be appropriate for your specific situation, but get ready. The PW is coming.



Guy Creese is an analyst with the Burton Group, covering content management and search. Creese has worked in the high tech industry for 25 years, at both Fortune 500 companies and small startups, in positions ranging from programmer to product manager to customer support engineer.  He can be reached at gcreese@burtongroup.com.

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.