Portals eNewsletters Web Seminars dataWarehouse.com DM Review Magazine
DM Review | Covering Business Intelligence, Integration & Analytics
   Covering Business Intelligence, Integration & Analytics Advanced Search
advertisement

RESOURCE PORTALS
View all Portals

WEB SEMINARS
Scheduled Events

RESEARCH VAULT
White Paper Library
Research Papers

CAREERZONE
View Job Listings
Post a job

Advertisement

INFORMATION CENTER
DM Review Home
Newsletters
Current Magazine Issue
Magazine Archives
Online Columnists
Ask the Experts
Industry News
Search DM Review

GENERAL RESOURCES
Bookstore
Buyer's Guide
Glossary
Industry Events Calendar
Monthly Product Guides
Software Demo Lab
Vendor Listings

DM REVIEW
About Us
Press Releases
Awards
Advertising/Media Kit
Reprints
Magazine Subscriptions
Editorial Calendar
Contact Us
Customer Service

Building Business Intelligence:
Achieving BI Query Performance

  Column published in DM Review Magazine
October 2005 Issue
 
  By William McKnight

According to Forrester Research, "...the number and diversity of BI [business intelligence] users will continue increasing steadily throughout the decade, achieving a penetration of between 25 percent and 40 percent of all enterprise users."1 We simply must get our BI environments beyond the "personal touch" that is possible with small user communities. Yet, there are challenges in achieving this vision. Not surprisingly, the biggest challenge is query performance. What is surprising is that many organizations manage their query performance issues by actually limiting user queries to a predefined list or predetermined times!

How is the vendor marketplace responding? It has been an acquisition race to extend functionality and provide end-to-end "one stop" shopping among the elite BI vendors. This includes players such as IBM, Oracle, Microsoft and SAP, among others. Fortunately, one need addressed in these "stacks" is data quality. Performance, however, is still relegated to minor, incremental improvements in the base technologies, such as database management systems. Smaller, less established vendors with important technology are not included as part of these functional stacks if they lack market share. It is hard for these smaller vendors to get noticed when the major vendors provide the one-stop shop. Tracking the array of offerings from just one prime vendor can be bewildering and time-consuming.

The quality of database administration and related disciplines to fully exploit database technologies has never been better. There is a finite set of nonhardware-based activities (physical modeling) that improves query performance, whether directly or indirectly.

  1. Indexes: Indexes are presorted, concise, alternate forms of the base data. The downsides are the additional physical storage required and the fact that the database management system (DBMS) optimizers need to appropriately use them.
  2. Parallelism: Assuming that you have the redundant hardware components and the required software to exploit them, many systems afford you the opportunity to partition your data yourself according to expected access.
  3. Summary Tables: Most access does not need detailed data. Summary tables, however, can take up enormous amounts of space. They frequently must be built in numerous ways from the detail data for numerous views, thus creating a high amount of redundant data in the environment. Summary tables quickly get out of sync with the underlying base data. The maintenance actually presents a bigger issue than the space.
  4. Multidimensional Structures: Multidimensional structures, or cubes, are similar to summary tables, but much more compact, with built-in implied access paths that can be easily used by data access tools. The maintenance challenges are even more pronounced with cubes.
  5. Data Marts: Data marts are usually subsets of data spun from the data warehouse itself. These marts tend to be targeted to very specific purposes. This creates additional extract, transform, load (ETL) logic in the architecture, as well as lags in data availability.

These methods are mostly properly used in BI programs, but because of query performance issues, many still do not deliver on the BI promise of "the right data to the right people at the right time." Hardware upgrades and additions often seem to be the only means available to improve performance without radically changing the underlying architecture. Many end up scaling hardware in excess proportion to its delivered value, which translates into sublinear performance gains.

Indexes are the most elegant manner of improving query performance. They are easiest to implement (with a simple CREATE INDEX statement). Traditional data warehouse optimizers are fairly sophisticated in the selection of indexes for query processing. Most importantly, indexes are automatically maintained and require no changes to existing tables, applications or interactive environments. They are seamless to the end user, and nearly seamless to the database administrator.

Indexes could be the primary way to improve the performance of BI queries if not for the drawbacks of storage and the challenge of multitable joins. Classic index disk storage is a factor of their size being something similar to: key size times number of records. BI time-to-value could be improved tremendously if the physical modeling were simplified to the point of collecting and cleaning detailed data, and the performance considerations were minimized. More importantly, the value of the existing investment in BI applications could actually be extended to the size of the user community that could benefit from it.

It cannot be emphasized enough that as query performance improves, so does system throughput.

RightOrder QueryEdge provides new technologies and software for indexing that accentuate the positive qualities of indexing, while minimizing the negative aspects. It does this by:

  1. Highly compressing the keys to a fixed size (7 bytes), regardless of the number of columns in the key. Thus, a page/block I/O will gather many more keys.
  2. Not limiting the columns you can put in an index. QueryEdge helps ensure that a join of arbitrary complexity can be indexed without impacting the index size.
  3. Making the indexes precomputed joins so that at execution time multitable queries are not required to use multiple indexes or tables, just the QueryEdge index. This drastically reduces the amount of required disk and memory I/O and lowers the CPU consumption by eliminating a join calculation. This reduces the overall system load and even the queries that are not directly using QueryEdge indexes will run faster. Entire system throughput is increased.

Given the alternatives available today to performance tune database management systems, including long cycle time and costly and risky hardware alternatives, an investment in an indexing approach would seem to yield the biggest return in query performance and, by virtue of a nonintrusive implementation, is an option worth adding.    

Reference:

  1. Russom, Phil and Keith Gile. Implementing Successful Enterprise Business Intelligence. Forrester Research, Inc. Jan. 2003.
...............................................................................

For more information on related topics visit the following related portals...
Business Intelligence (BI) and Query & Reporting.

William McKnight has architected and directed the development of several of the largest and most successful business intelligence programs in the world and has experience with more than 50 business intelligence programs. He is senior vice president, Data Warehousing for Conversion Services International, Inc. (CSI), a leading provider of a new category of professional services focusing on strategic consulting, data warehousing, business intelligence and information technology management solutions. McKnight is a Southwest Entrepreneur of the Year Finalist, keynote speaker, an international speaker, a best practices judge, widely quoted on BI issues in the press, an expert witness, master's level instructor, author of the Reviewnet competency exams for data warehousing and has authored more than 80 articles and white papers. He is the business intelligence expert at www.searchcrm.com. McKnight is a former Information Technology Vice President of a Best Practices Business Intelligence Program and holds an MBA from Santa Clara University. He may be reached at (214) 514-1444 or wmcknight@csiwhq.com.



View Full Issue View Full Magazine Issue
E-mail This Column E-Mail This Column
Printer Friendly Version Printer-Friendly Version
Related Content Related Content
Request Reprints Request Reprints
Advertisement
advertisement
Site Map Terms of Use Privacy Policy
SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.
SourceMedia is an Investcorp company.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.