X-Engineering, Zero Latency Enterprise Will Put the Spotlight on Data Quality

Article published in DM Direct Newsletter, October 21, 2005 Issue

By Semyon Axelrod

According to James Champy, coauthor of the bestseller Reengineering the Corporation, the market players that can respond to critical market events faster than their competitors will end up as winners in the emerging new economy.1 It is safe to assume that most of these market players have already reengineered their business processes within corporation boundaries to achieve better efficiency. To win the next phase of the never-ending market race, they will also need to integrate their business processes with those of their suppliers and business partners. The ability to quickly adjust processes to better respond to one's customers will likewise become a decisive factor in the new economy.2

In this type of economic environment, the latency between the initial market event (any kind of significant disruption to the market status quo) and a response from the integrated process chain cannot be months or even weeks. The winners will have days, and sometimes only hours, to react to changes in the supply chain or to a new customer trend.

Taking this into account, successful corporations that are aspiring to become winners in the new global race should be thinking about zero latency processing. For instance, in today's marketplace, a new financial services product usually guarantees its inventor a head start of a few months, typically resulting in substantial financial gains. However, if the other market players can respond within days instead of months, they can practically eliminate the competitor's advantage of being the first to market.

X-Engineering Will Lead to Zero Latency Enterprise

Given that modern business processes rely heavily on information systems, and as market forces keep pushing companies toward faster updates to their business processes, the information systems' implementation and deployment cycle becomes more important. One of the more popular approaches - the zero latency enterprise - encourages the creation of a feedback loop from the online analytical processing (OLAP) side, where tactical and possibly even strategic decisioning takes place, back into the online transaction processing (OLTP) systems in order to accelerate the event-response sequence.

As the Update Cycles Accelerate, Data Quality Will Become Even More Important

Traditionally, the quality of data stored in the enterprise data warehouse (EDW) significantly influences the quality of the decisioning process. In turn, the quality of data housed in the EDW depends on the quality of data produced by the OLTP systems. With margins contracting every year, the difference between success and failure of a significant undertaking may well hinge on a relatively obscure operational attribute captured by some operational system and then consumed by the EDW. Unfortunately, the more complex the business process is, the more difficult it is for the OLTP systems to produce high-quality data. Further, in the near real-time (NRT) enterprise environment, the OLTP systems should be able to accept changes in the operational parameters that the OLAP/decision-support systems (DSS) produce.

In order to support this fast update cycle, a rules-based or similar fast-deployment-cycle technology should be used. The existence of an NRT feedback loop from the OLAP side back to the OLTP side of the enterprise supports very rapid changes to the business process, but at the same time exacerbates any inconsistencies and errors introduced when information is transformed and loaded from the OLTP systems into the EDW/OLAP side. For example, an erroneous calculation of loan processing costs in the EDW (based on an incorrectly captured operations time) may lead to an automated decision to open this financial product (loan type) to more clients. This decision would be automatically consumed by the appropriate operational systems with an Internet-enabled front end and may significantly affect the business's financial results. If it turns out that the calculation was wrong, the net effect may be a substantial loss instead of a hefty profit. The elimination of time-consuming manual steps in the process, while providing a corporation with the ability to respond very quickly to a market event, will at the same time put even more emphasis on decisioning and thus on data quality.
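
To make the loan example concrete, here is a minimal, hypothetical sketch (in Java; the class name, cost threshold and credit-score cutoffs are invented for illustration and are not taken from any actual project) of an automated rule that consumes an EDW-computed processing cost and widens eligibility when that cost looks low. If the cost was derived from an incorrectly captured operations time, the rule propagates the error straight back into the operational front end, with no manual step left to catch it.

// Hypothetical sketch of an automated OLAP-to-OLTP feedback rule; the class name,
// cost threshold and score cutoffs are invented for illustration.
public class LoanEligibilityRule {

    /** Per-loan processing cost as calculated in the EDW (possibly from bad operational data). */
    private final double edwReportedProcessingCost;

    public LoanEligibilityRule(double edwReportedProcessingCost) {
        this.edwReportedProcessingCost = edwReportedProcessingCost;
    }

    /**
     * Automated decision consumed by the Internet-enabled front end: if the product
     * looks cheap to process, open it to a wider client segment. An understated cost
     * (for example, from a mis-captured operations time) silently widens eligibility
     * for a product that is actually unprofitable.
     */
    public int minimumCreditScoreRequired() {
        return edwReportedProcessingCost < 250.0 ? 620 : 680;
    }

    public static void main(String[] args) {
        // Correctly captured operations time -> true cost of 310.00 -> stricter eligibility (680).
        System.out.println(new LoanEligibilityRule(310.00).minimumCreditScoreRequired());
        // Mis-captured operations time -> understated cost of 180.00 -> eligibility widened in error (620).
        System.out.println(new LoanEligibilityRule(180.00).minimumCreditScoreRequired());
    }
}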

Traditional Approach to Data Quality Needs Improvement

In a classical EDW environment, extract, transform and load (ETL) tools assume responsibility for extracting data from the source systems as well as for transforming, cleansing and loading the data into the EDW/OLAP systems.

At the same time, the OLTP and OLAP developers have to address a rather different set of issues. It is not surprising that there is oftentimes a mental impedance mismatch between the OLTP and the OLAP staff that results in disagreements about:

  • Push versus pull (extract in ETL),
  • Data transformation responsibilities and techniques, and
  • Data cleansing approach.

Traditionally, IT departments rely on the teams responsible for the EDW/OLAP processing to address the ETL issues, using the old and unfortunately ineffective creed, "You need it, you do it."

This approach does not work well, as the cost of loading the OLAP data stores with reliable, high-quality data, and especially of keeping the OLAP data stores semantically synchronized with changes in the source OLTP systems, is very high. In the most common scenarios, changes introduced on the OLTP side still require weeks, sometimes months, to be correctly reflected in the EDW/OLAP systems.

Making it Happen

Two factors have come together to change the traditional approach. On the business side, as I have already pointed out, leadership is demanding shorter implementation cycles, and there is a trend toward integrating many intercorporation business processes into one end-to-end, highly efficient process. On the technology side, the emergence of and advancements in service-oriented architecture (SOA) have created momentum in IT departments toward better understanding, and thus better modeling, of business processes.

With the SOA advancement, the OLTP side is becoming much better structured: the issues at the syntax and communication protocol level are addressed, and boundaries are now explicit. The asynchronous nature of communication requires understanding, capturing and transmitting time-state information. The advent of SOA is creating a foundation and an industry impetus to start viewing data issues in a new light, connecting them more closely with business process management. While the SOA way of thinking can help, it is not sufficient, within the scope of the SOA framework, to address the issue of semantic differences between the source and target systems. These differences in semantics are impossible to address without capturing enough contextual information to reason about them, namely (a small sketch of such a context-carrying data element follows the list):

  • Timing
  • Relationship to the rest of the domain
  • Business process-level coordination.
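
The following minimal sketch (in Java; the field names, identifiers and coordination scheme are assumptions made for illustration, not part of any SOA standard or of the project described later) shows one way a data element can travel with the three kinds of context listed above:

// Hypothetical sketch (not any SOA standard): a data element that travels with
// the contextual information needed to reason about semantic differences.
import java.time.Instant;

public class ContextualDataElement {

    /** A value plus the context required to interpret it on the OLAP/DSS side. */
    record ContextualValue(String name,
                           String value,
                           Instant capturedAt,            // timing
                           String domainRelationship,     // relationship to the rest of the domain
                           String processCoordinationId)  // business process-level coordination
    { }

    public static void main(String[] args) {
        ContextualValue amount = new ContextualValue(
                "loan.requestedAmount",
                "285000.00",
                Instant.parse("2005-10-21T14:05:00Z"),
                "Loan#7731 -> Borrower#4410 -> Application#9920",
                "originate-loan-process/step-3-underwriting");
        System.out.println(amount);
    }
}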

Meet Data in Context

On my most recent project (for a medium-sized financial services company), the project team developed an approach that addresses significant issues that until now were preventing this company, as well as other companies, from realizing the benefits of NRT analytical decision-support technology. The cornerstone of this approach is the creation and rigorous maintenance of a rich contextual domain model on the OLTP side. The existence of this ontological model removes two main obstacles on the way to the zero latency enterprise.

First, the rich contextual OLTP-side model facilitates a better understanding of information within the context of the business process, which in turn enables business process integration within as well as across corporation boundaries.

Second, the OLTP side of the enterprise can now output information according to the specs produced by the EDW/OLAP/DSS side with improved quality and efficiency. While the ETL processes still exist, the cycle of producing the information required by the strategic and tactical decision makers is now significantly shorter.

Decentralized, Rich OLTP-Side Domain Model Is the Key

The OLAP-side approach that capitalizes on a single enterprise meta data repository has not yet been successfully applied on the OLTP side of an enterprise.3 This is not surprising, given that the business processes, and thus the various OLTP systems themselves, are much more diverse in nature than the more homogeneous OLAP-side systems. For instance, the operational processes in the acquisition department and on the trading desks are different in scope, use different terminology and have different key performance indicators.

Data architecture teams have realized that the diverse business process context may pose a problem on the way to creating a single OLTP-side meta data repository and have suggested an alternative approach. This approach strips the data of most of its business process context in order to make it easier to correlate data from different processes and OLTP systems.

One example of this common technique is the data dictionary approach. Unfortunately, it does not work well in the long run: data divorced from its context rapidly becomes useless as business process complexity increases. For instance, a typical data dictionary for a financial services company would have an address structure defined. While it may be sufficient for very simple cases, with an increase in business process complexity data analysts and system developers find themselves dealing with numerous variations of the address structure: current client residence address, property address, client correspondence address, third party address and so forth.
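
As an illustration, the following sketch (in Java; the type names and roles are hypothetical and merely echo the variations listed above) contrasts a single context-free dictionary entry with an address that carries its business-process role explicitly:

// Hypothetical sketch: keeping the business-process role attached to an address
// instead of relying on a single context-free data dictionary entry.
public class AddressContextExample {

    /** The structural part a data dictionary would typically define. */
    record Address(String street, String city, String state, String zip) { }

    /** The roles an address may play within a lending business process. */
    enum AddressRole {
        CURRENT_CLIENT_RESIDENCE,
        PROPERTY,
        CLIENT_CORRESPONDENCE,
        THIRD_PARTY
    }

    /** An address that carries its role, so downstream systems need not guess. */
    record ContextualAddress(AddressRole role, Address address) { }

    public static void main(String[] args) {
        Address sameStructure = new Address("100 Main St", "St. Paul", "MN", "55101");
        // Structurally identical, semantically different:
        System.out.println(new ContextualAddress(AddressRole.PROPERTY, sameStructure));
        System.out.println(new ContextualAddress(AddressRole.CLIENT_CORRESPONDENCE, sameStructure));
    }
}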

Furthermore, the address case described is relatively simple compared to three- or four-layered hierarchical data structures such as a credit report or a credit score. A credit score, for example, may be aggregated at different levels: a loan, a borrower or a borrower group, with the borrower-level score provided by a number of credit vendors. Credit vendors, in turn, may use different aggregations of credit scores from the three credit repositories: Experian, TransUnion and Equifax. Considering that these credit repositories in turn may use different scoring models, it quickly becomes apparent how fast information complexity can increase.
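
A minimal, hypothetical sketch of this hierarchy follows (in Java; the aggregation rules - taking the middle repository score and the weakest borrower - are invented for illustration and do not represent any vendor's actual model):

import java.util.List;

// Hypothetical sketch of the three/four-level credit score hierarchy; the
// aggregation rules are illustrative only.
public class CreditScoreHierarchy {

    /** A score from one of the repositories (Experian, TransUnion, Equifax). */
    record RepositoryScore(String repository, int score) { }

    /** A vendor-supplied borrower-level score, derived from repository scores. */
    record BorrowerScore(String vendor, List<RepositoryScore> repositoryScores) {
        int value() {
            // Illustrative rule: the middle of the repository scores.
            return repositoryScores.stream()
                    .mapToInt(RepositoryScore::score)
                    .sorted()
                    .skip(repositoryScores.size() / 2)
                    .findFirst()
                    .orElse(0);
        }
    }

    /** A loan-level score aggregated over all borrowers on the loan. */
    record LoanScore(List<BorrowerScore> borrowerScores) {
        int value() {
            // Illustrative rule: the weakest borrower drives the loan-level score.
            return borrowerScores.stream().mapToInt(BorrowerScore::value).min().orElse(0);
        }
    }

    public static void main(String[] args) {
        BorrowerScore primary = new BorrowerScore("VendorA", List.of(
                new RepositoryScore("Experian", 702),
                new RepositoryScore("TransUnion", 689),
                new RepositoryScore("Equifax", 710)));
        BorrowerScore coBorrower = new BorrowerScore("VendorB", List.of(
                new RepositoryScore("Experian", 655),
                new RepositoryScore("TransUnion", 661),
                new RepositoryScore("Equifax", 648)));
        System.out.println(new LoanScore(List.of(primary, coBorrower)).value()); // prints 655
    }
}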

The ability to reliably correlate data from the different OLTP systems across the contexts of their own business processes is essential to solving the data quality problem for conventional enterprise models, and even more so for zero latency enterprise models. Unfortunately, quite often, due to the lack of a well-defined system development process and a shortage of analysts with appropriate modeling skills, this correlation analysis (sometimes called mapping) is left to the data analysts and developers. The skill set possessed by these groups of professionals - specifically, physical integration of OLTP RDBMS-based systems - does not lend itself well to the domain/business process-modeling problem. Add to this the rather common absence of any meta data repository for the business process models that would be available to an average Java or .NET developer, and the result is a status quo of tightly coupled physical database systems. This approach makes the entire OLTP side brittle and prone to producing the unreliable, polluted data consumed by the OLAP side, typically followed by a never-ending cycle of blame for the low data quality exposed during the extraction, transformation and cleansing steps.

In order to successfully integrate OLTP systems, two main issues need to be addressed. First, every application, or group of applications, that will be treated as an independent processing entity with well-defined boundaries should have a rich meta-information repository. This repository unambiguously defines all the relevant data within the scope of the business processes supported by the system. For instance, I was recently part of an effort where the domain model had three main parts: a business class model, business use case realizations and a system use case model. No data element would be added to the domain model unless it was initially called out in the business and system use cases.
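
The following is a minimal sketch (in Java) of what a single entry in such a repository might capture; the class name, identifiers and admission rule are assumptions made for illustration, not the actual artifacts from the project described:

import java.util.List;
import java.util.Optional;

// Hypothetical sketch of a rich meta-information repository entry; the
// identifiers and validation rule are illustrative only.
public class DomainModelRepository {

    /** A data element tied back to the use cases that justify its existence. */
    record DataElementEntry(String businessClass,
                            String attribute,
                            List<String> businessUseCases,
                            List<String> systemUseCases) { }

    /** Enforces the rule: no element enters the model without a use-case trace. */
    static Optional<DataElementEntry> admit(DataElementEntry entry) {
        if (entry.businessUseCases().isEmpty() || entry.systemUseCases().isEmpty()) {
            return Optional.empty(); // rejected: no business/system use case called it out
        }
        return Optional.of(entry);
    }

    public static void main(String[] args) {
        var traced = new DataElementEntry("Loan", "requestedAmount",
                List.of("BUC-12 Originate Loan"), List.of("SUC-31 Capture Application"));
        var untraced = new DataElementEntry("Loan", "legacyFlag7", List.of(), List.of());
        System.out.println(admit(traced).isPresent());   // true
        System.out.println(admit(untraced).isPresent()); // false
    }
}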

Second, a well-defined process should be created and rigorously maintained for information correlation between the meta data repositories of the different areas. Each department should be responsible for the creation of its own domain model, but the correlation process and the artifacts that capture the results of this process (in our case, we call them overarching system use cases) are the joint responsibility of the departments that are integrating their business processes.
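
As a purely hypothetical illustration of the correlation artifact (the departments and term pairs below are invented for the example), an overarching system use case can record how two departments' domain models refer to the same underlying concept:

import java.util.Map;

// Hypothetical sketch of a cross-department correlation artifact; the term
// pairs are invented for illustration.
public class OverarchingUseCaseMapping {
    public static void main(String[] args) {
        // Acquisition department term -> trading desk term, owned jointly by both departments.
        Map<String, String> correlation = Map.of(
                "Acquisition.Loan.purchasePrice", "Trading.Position.acquisitionCost",
                "Acquisition.Borrower.riskGrade", "Trading.Pool.weightedRiskTier");
        correlation.forEach((a, b) -> System.out.println(a + " <-> " + b));
    }
}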

References:

1. Hammer, Michael, and James Champy. Reengineering the Corporation: A Manifesto for Business Revolution. New York: HarperBusiness, 2003.

2. Champy, James. X-Engineering the Corporation: Reinventing Your Business in the Digital Age. New York: Warner Business Books, 2003.

3. If somebody knows otherwise, please let me know. For all the years that I have been involved in IT (more than 25), I have never heard of a single successful case.


Semyon Axelrod has more than 25 years of experience in various areas of software engineering as well as management and information systems.  He lives in Minnesota where he specializes in enterprise architecture, business application development and integration as well as business process and systems modeling.  He can be reached at semyonaxelrod@yahoo.com.
