FREE DM Review Site Registration!
Sign-up today and access DM Review on the Web!

Your FREE registration entitles you to:

FREE email newsletters

FREE access to all DM Review content

FREE access to web seminars, resource portals, our white paper library and more!

   

Are there ways to improve the performance of data access through the reporting tool as well at the ETL process?

Question: We have a case where a data mart is being built on Oracle HR application. Due to the complexity of the application the data model for data mart has ended up being a 3NF. Because of this design there are a lot of performance issues. One of the solutions could be the creation of materialized views to mask the 3NF in the reporting tool for reports and for access through the reporting tool. Is there any other way to improve performance of data access through the reporting tool? ETL performance is also a problem. Are there any tried and tested practices to be followed in such cases for improving performance? Currently incremental loads are being executed.

Sid Adelman's Answer: Before you look at solutions, you need to know why there are performance problems on both the BI and the ETL sides which could be caused by any number of factors with many of them related and interdependent. Your question gave no specifics on your diagnosis of the performance problem. We have seen organizations throw money and effort at parts of the system that don't have a problem. What parts of your system have you monitored and what have you determined from the monitoring? Only after you have discovered where you have the problems should you evaluate solutions to address them.

Evan Levy's Answer: This is a common problem - and one that has several solutions.

Based upon your remarks, it sounds as though the performance problems you're running into are caused by queries that require multiple table joins as well as full table scans that are occurring against those tables. You may also be running into load performance issues if your tables have a large number of indexes.

We often recommend our clients develop a series of secondary tables that are dimensional in structure to support reporting activities. This requires a two-step ETL process. Step 1 focuses on transformation, data cleansing and loading data into the third normal form (3NF) tables. Step 2 loads data from the 3NF environment directly into the reporting tables.

While this does introduce a second load step and some additional storage, it's a common approach for supporting high-performance reporting.

We often see a "kill and fill" approach work better when building a reporting table than using incremental loads. Much of this has to do with some performance features that most DBMSs have when loading empty tables.


Sid Adelman is a principal in Sid Adelman & Associates, an organization specializing in planning and implementing data warehouses, in data warehouse and BI assessments, and in establishing effective data architectures and strategies. He is a regular speaker at DW conferences. Adelman chairs the "Ask the Experts" column on www.dmreview.com. He is a frequent contributor to journals that focus on data warehousing. He co-authored Data Warehouse Project Management and is the principal author on Impossible Data Warehouse Situations with Solutions from the Experts and Data Strategy. He can be reached at (818) 783-9634 or visit his Web site at www.sidadelman.com.

Evan Levy is a partner and co-founder of Baseline Consulting Group, a multivendor systems integration and consulting firm. As the partner in charge of Baseline’s largest practice, Levy leads both executives and practitioners in delivering technology solutions that help business users make better decisions. He has led strategic technology implementations at commercial and public sector organizations and advises vendors on their product development and delivery strategies. Levy has been published in a wide array of industry magazines and has lectured on a range of technology delivery experiences at leading conferences and vendor events. He has been a featured speaker at the Marcus Evans Analytical CRM symposium, DCI’s Data Warehousing conference, the CRM Association, DAMA International, the AMA and the Data Warehousing Institute. His current work involves delivering and lecturing extensively on the topic of data integration. You can contact him at evanlevy@baseline-consulting.com.

For more information on related topics, visit the following channels:



Industry Vendors