Portals eNewsletters Web Seminars dataWarehouse.com DM Review Magazine
DM Review | Covering Business Intelligence, Integration & Analytics
   Covering Business Intelligence, Integration & Analytics Advanced Search

View all Portals

Scheduled Events

White Paper Library
Research Papers

View Job Listings
Post a job


DM Review Home
Current Magazine Issue
Magazine Archives
Online Columnists
Ask the Experts
Industry News
Search DM Review

Buyer's Guide
Industry Events Calendar
Monthly Product Guides
Software Demo Lab
Vendor Listings

About Us
Press Releases
Advertising/Media Kit
Magazine Subscriptions
Editorial Calendar
Contact Us
Customer Service

Business Intelligence:
Technical Considerations When Business Intelligence Applications Access ERP Databases

online columnist Jonathan Wu     Column published in DMReview.com
May 30, 2003
  By Jonathan Wu

Since its introduction in the early 1980s, enterprise resource planning (ERP) software has replaced custom-built applications and has become the primary transactional system of most established organizations. ERP software provides the ability to efficiently integrate departments and functions into a single application by automating business processes that conduct and record the activities of the organization.

ERP Data Access Complexities

Most ERP applications utilize relational database management systems (RDBMSs) as the primary software to manage and store the data that is collected and processed. While ERP applications are able to integrate the functions of the organization, accessing the data from the database is challenging. Having extensive experience in accessing ERP databases for reporting, ad hoc query and decision support, there are several technical aspects that should be considered.

Database Schema

Most ERP databases have been designed based upon the principles of data normalizations. These principles are a set of techniques for organizing data into tables in a manner that eliminates redundancy and incompleteness. This in turn reduces the amount of storage needed by databases and provides a technique to logically organize data. However, most database designers or data modelers loosely adhere to these principles when developing the database schema to support ERP applications. There is a tendency to stray from these principles when designing a database schema because it requires theoretical discipline, which is often difficult to achieve when trying to satisfy business requirements. ERP database schemas that utilize the principles of data normalization usually do not go beyond the third normal form (3NF), which means that the data is organized in the following manner:

  • First Normal Form (1NF) requires that a table contain columns in which data of the same kind is located, repeating groups are separated into new tables and that a column or columns are designated at the primary key to uniquely identify a row of data within the table.
  • Second Normal Form (2NF) requires further refinement of a table by identifying and separating subsets of data that appear in more that one row into new tables and the creation of relationships between tables by the use of primary and foreign keys.
  • Third Normal Form (3NF) requires that that every column within a table be dependent on the primary key for its value. If a column is not dependent, then a separate table should be created for it.

Each form of normalization builds upon the previous form. Therefore, in order to be in 3NF, the data must have been organized in a manner that achieves 1NF and 2NF. While these are the principles of data normalization, understanding the ERP database schema can be very challenging because the software vendors rarely provide a complete logical model, which means that you will have to piece together the limited information that they provide. Also, the database schema may not be intuitive if the creator significantly deviated from the principles of data normalization.

Naming Conventions

One of the many distinguishing characteristics between the various ERP software vendors is their naming convention for the database schema tables and columns. The naming conventions that are used by ERP software vendors are unique to their products and require a technical reference manual in order to decipher and understand the purpose of a table or the characteristics of a column. For example, if you were trying to locate the purchase order number within the database schema of one ERP vendor, you would have to locate the column named "segment1" within the table, "PO_HEADERS." This naming convention is not intuitive and requires an intimate knowledge of the ERP database schema in order to extract data.

Complex Relationships

The relationships that often exist within an ERP database are complex. Complex relationships, such as self- referencing and many to many can be a problem when trying to generate a structured query language (SQL) statement to extract the data. These relationships may require multiple passes of the database or nested subqueries within an SQL statement in order to extract the data that has been requested. Understanding these relationships and how to resolve them from a data extraction perspective requires an intimate knowledge of the ERP database schema and a set of technical reference manuals.

Indexing Strategies

As information is entered into the ERP application, data is inserted, updated or deleted in the database. The indexes on columns within a table are designed for data entry and processing by the ERP application. Changing or adding indexes to an ERP database can cause the application to have performance problems and with certain vendors, void the product warranty.

Approaches to Accessing ERP Data

There are several approaches that one can take to configure a business intelligence (BI) application to extract data from an ERP database. Each approach has its benefits and drawbacks. The most commonly used approaches are:

Direct Access. This approach is often performed first because it requires no additional work other than the configuration of the BI application to access the ERP database. There are several items to consider with this approach. First, can the BI application generate complex SQL statements? Second, does the query performance meet the expectations of the users of the BI application? If you answered "yes" to both of these questions, then this is the approach for you. If not, evaluate the next approach.

Views. If problems are encountered with direct access because the BI application cannot handle complex SQL statements or query performance is slow, then another approach to consider is the use of a database view. The BI application can access a view, which is a predefined SQL statement that can resolve the issues with complex relationships and can be tuned to provide faster query performance. It is a quick solution to resolving the problem of extracting a predefined set of data from a complex database schema. However, a view is predefined and can have limited value if users are developing ad hoc queries or requesting data that is not within the view. It can also become a maintenance nightmare if users' requirements are constantly changing. The problem can be exacerbated if the views are not properly documented or cataloged. If you do not have the ability to efficiently maintain the views or if user requirements are constantly changing, you should probably consider another approach to accessing ERP data.

Materialized Views. Unlike a view, a materialized view does not need to be compiled every time it is referenced. The data is compiled when it is created and therefore, provides faster query performance. However, view maintenance must occur whenever there are changes to the underlying data structure. In addition, the maintenance and functionality issues are the same as with a view.

Multidimensional Cube. A multidimensional cube is a predefined set of data that users of a BI application can access. It provides optimal performance since the result set has already been generated. However, similar to the previous two approaches, the generation of multidimensional cubes can become a maintenance nightmare if users' requirements are constantly changing or the underlying data needs to be frequently refreshed.

Denormalized Table. Reassembling data from several existing tables creates a denormalized table, which can be incorporated into an ERP database schema to improve performance while providing the functionality for ad hoc queries. This approach is a step toward creating a data mart solution since it requires the extraction, transformation and loading of data from the existing tables into the denormalized table. The drawback to this approach is the refresh and maintenance of the denormalized table.

Dimension Model. A dimension model is the optimal approach to creating an information solution for users who need fast query performance and functionality to address their ad hoc business questions. This solution provides the greatest flexibility from a user perspective because they are not constrained by the structure of the view or a predefined result set. However, it requires the extraction, transformation and loading of data from the ERP database schema into a dimensional or star schema.


While ERP databases are rich in data content, that ability to easily and quickly extract data for reporting, analysis and decision-making purposes can be difficult. There are several approaches that one should consider when embarking on this adventure: each has its benefits and drawbacks. The challenge is for you to select the best approach with the BI application that you have available to meet the needs of the user community.


For more information on related topics visit the following related portals...
ERP and Business Intelligence.

Jonathan Wu is a senior principal with Knightsbridge Solutions. He has extensive experience designing, developing and implementing information solutions for reporting, analysis and decision-making purposes. Serving Fortune 500 organizations, Knightsbridge delivers actionable and measurable business results that inform decision making, optimize IT efficiency and improve business performance.  Focusing exclusively on the information management disciplines of data warehousing, data integration, information quality and business intelligence, Knightsbridge delivers practical solutions that reduce time, reduce cost and reduce risk. Wu may be reached at jwu@knightsbridge.com.

Solutions Marketplace
Provided by IndustryBrains

Numara Track-It! Help Desk Software
Numara provides Track-It! - the leading help desk solution for call tracking, problem resolution, IT asset management, LAN/PC auditing, patch management, electronic software distribution, remote control, and more. Free demo

Way better than backup! - Continuous Protection
Free White Paper. TimeSpring's CDP software for Windows automatically captures all changes in REAL TIME so you can recover from data loss, corruption or security breaches in minutes. Analyze, report and test data anytime without impacting operations.

Website tracking, web statistics and analytics
Watch your visitors in real time as they browse your site. Website statistics provide insight. Understanding website traffic and visitor clickstream behavior is crucial to managing a website on a daily basis. Real-Time reporting. 4 week free trial.

Rosette Linguistics Platform
Basis Technology utilizes powerful techniques to provide software solutions for extracting meaningful intelligence from unstructured text in Asian, European and Middle Eastern languages.

Click here to advertise in this space

E-mail This Column E-Mail This Column
Printer Friendly Version Printer-Friendly Version
Related Content Related Content
Request Reprints Request Reprints
Site Map Terms of Use Privacy Policy
SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.
SourceMedia is an Investcorp company.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.