Enterprise Information Integration: A Pragmatic Approach

  Article published in DM Direct Newsletter
June 10, 2005 Issue
 
  By JP Morgenthal

Enterprise Information Integration: A Pragmatic Approach (ISBN 1-4116-2974-4) by DM Review online columnist JP Morgenthal is available at http://www.lulu.com/content/121621. The book is also available at Amazon.com: http://www.amazon.com/exec/obidos/ASIN/1411629744/qid=1117737587/sr=11-1/ref=sr_11_1/103-4642975-8619036.

An organization that has a modern IT infrastructure, but can't rapidly answer questions about its business, is like an explorer who has a map, but doesn't know how to read it! Information and knowledge are the best tools for navigating an increasingly competitive business environment.

There is data everywhere you look. It is on your receipts when you go food shopping. It is on the monthly financial statements and bills you receive by mail. It's even on the front and back of your car. Before there were computers, there was data. Now, the only difference is that we've taken to storing much of that data electronically, so that we can access it more quickly and easily.

It should be no surprise that information technology departments within organizations have been ignored in favor of sales, marketing and finance. Computers were created to increase the productivity of all other departments. When we increase productivity, we increase output without increasing the cost of producing that output. Hence, computers are just an enabler for creating more wealth, and they've accomplished this mission well. We've tripled the size of the economy since the first mainframes came online.[1]

Productivity, however, is no longer providing leading countries with enough growth to maintain the competitive advantages they have come to rely on. Influences from globalization, such as lower costs of labor in developing countries and fluctuations in currencies, mean that companies need to produce higher quality products, and products that meet individual customer demands, even if meeting these requirements entails extensive customization of the product.

For example, the auto industry has gone from the famous Ford statement, "You can have any color you'd like, as long as it's black," to auto manufacturers using colors as a market differentiator, as part of an assembly line process, to customers having a car built to their specification and delivered to their door. This need to move away from the assembly line mentality to a service-oriented mentality has permeated all industries and is causing a revolution. As a result, business leaders are now required to pay attention to the data in those systems that merely provided automation of routine tasks in the past, to glean intelligence about the industry they're in and the customers that they do business with.

Unfortunately, many of the systems we're relying on today to provide us critical business information were not designed with this task in mind. Many of these systems were designed for the sole purpose of increasing productivity. Thus, only the data necessary to produce the intended increase was incorporated into the system. Fortunately, the need for greater and greater productivity has forced us to change systems or build new ones to meet the demand. This process has resulted in data accumulating in rough layers that have been loosely knitted together to provide us with many of the answers we need today to remain competitive. However, these layers have become so fragmented and isolated that it requires Herculean efforts to make sense of them.

To drive this point home even further, the CEO of a major logistics and transportation company called his efforts to obtain a consolidated view of all business done with a particular trucking company across all business units, a "Chinese fire drill." This quote indicates that his information systems do not support his needs, and instead, require the chaotic process of humans manually pulling data and consolidating it, in order to obtain the values this CEO needs.

If we're looking for greater productivity in our organizations to increase net profits, here's a prime place to start. Of course, there are two ways to approach this information problem: 1) rip and replace, or 2) integrate. Neither is perfect. Both are expensive. However, since rip and replace is not a one-for-one replacement, it requires that all new systems be developed, deployed and parallel-tested for at least six months before they can be relied upon in a production environment. By the end of this process, two to five years might pass, along with changes in the market and economy that the strategy never accounted for.

The alternative approach, integration, can be performed on an as-needed basis, continually delivering value from the current system in a production-level manner, while meeting the needs of the users for new data structures and applications. However, bringing together the layers of data that have been accumulating for years will present a daunting challenge that requires a novel approach. This approach is known as Enterprise Information Integration (EII). It includes specialized software that is designed to meet these challenges.

If you're reading this book, you're most likely interested in identifying better ways to leverage the data assets within your organization. You may be concerned about the emergence of yet another approach to integration, after significant dollars were already spent on integration technology. You may find it interesting that, in survey after survey, chief information officers consistently define integration as one of the problems that keep them up at night. As other industry leaders in the integration space have noted, there are no "silver bullets" when it comes to integration. Each new approach hopefully brings with it the ability to satisfy integration requirements faster, less expensively, and with fewer resources.

While we will explore the benefits of EII in relation to other integration technologies in the next chapter, let's now look at some of the business drivers that have led to a need for EII. Most Information Systems in production today suffer from these major drawbacks:

  • Require IT assistance for end-user access
  • Difficult for end users to identify relevant information
  • Overload of data delivery
  • Provide only partial answers to questions
  • Often present out-of-date information
  • Contain enormous amounts of redundant data
  • Expensive to develop and maintain

The truth of the matter is that we're delivering data, not information. Notice that we're being very specific by not using the terms data and information interchangeably. Most people speak of data and information as if they are the same thing. This confusion of terms is one of the leading causes of the problems in information systems, as noted above. These are distinct concepts, as we will see shortly.

EII is an approach to integration that has arisen out of the need for organizations to identify and correlate related, but separate, data. It allows users to derive new data structures and information models without having to understand the nuances of underlying data structures, data locations, data types, etc. In essence, EII solutions provide access to the data without the hindrances of the underlying technology selection.

One common problem EII has been applied to is identified as a single view of the customer. If you consider the IS infrastructure of any modern mid- to large-sized organization, you will find a number of legacy systems that have related and, sometimes, replicated data. The reasons for the emergence of these differing systems include performance, response times, data governance, etc. However, the pragmatic approaches taken by IS to respond to users' needs have resulted in an inability for the company to see clearly and accurately across these systems.

While integration techniques such as Enterprise Application Integration (EAI) and Service-Oriented Integration (SOI) exist, there are requirements to harness the mounds of raw data within our organizations and enable users to more easily identify, correlate, process, and reuse this data within the business's processes. That is, EII may have emerged because of requirements, such as single view. However, the integration technique answers the much larger need for making data more digestible and accessible regardless of one's role within the organization.

This book is intended to satisfy the needs of those who want to quickly gain insight into EII as a new approach toward integration, and to better understand where and how this technology can be applied to solve complex data problems.

What is Information?

MIT Scholar Geoffrey Brooke sums up the relationship between people and information best. "The more information there is, the more time you have to spend converting it (into knowledge). We've made it easy to produce, collect and transmit information. We haven't made it easy to consume information. Consuming information is just as slow as it always was."[2] Enterprise Information Integration is an architectural approach that makes it easier not only to create information, but also to consume it.

Information is defined as the communication or reception of knowledge or intelligence. If you receive a table of numbers with no headers and no explanation, you've received data, not information. Only when the headers and other metadata are aggregated with the data do you have information.
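As a small illustration of this distinction, the sketch below pairs a bare table of numbers with headers and units. The column names, units, and values are hypothetical and chosen only for illustration; they do not come from any system described in this book.

```python
# Raw data: a table of numbers with no headers and no explanation.
raw_rows = [
    [1001, 250.0, 3],
    [1002, 75.5, 1],
]

# Hypothetical metadata: what each column means and the unit it is measured in.
metadata = [
    {"name": "customer_id", "unit": None},
    {"name": "order_total", "unit": "USD"},
    {"name": "items", "unit": "count"},
]


def attach_metadata(rows, columns):
    """Combine each row with its column names so the values carry meaning."""
    names = [col["name"] for col in columns]
    return [dict(zip(names, row)) for row in rows]


records = attach_metadata(raw_rows, metadata)
print(records[0])
```

The numbers themselves are unchanged; what turns them into something a reader can interpret is the context carried alongside them.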

The Aberdeen Group explains this as the task of strategic information management. They describe the short-term goal of strategic information management as the ability to deliver available information in a consistent manner, on demand. They describe its long-term job as identifying, managing, adding, and supporting the leveraging of the information resources of the organization.[3]

An EII solution was recently implemented in response to the requirements of a major credit card company in the United States. The solution needed to meet the company's overall goals. They included:

  • Support dynamic integration of new portfolios
  • Overall corporate agility
  • Alignment of Information Systems with the goals of the business
  • Increase effectiveness in managing portfolios
  • Become more efficient in the use of corporate information

Clearly, this company understands the impact that information and Information Systems have on corporate growth. However, this company also realized that its existing Information Systems infrastructure was a hurdle to achieving its goals. It needed to align the employees and applications around a common business vocabulary to drive consistency and transform its data into information.

A recent IDC report lists "single customer view across channels and offerings" as one of six CEO-level business priorities that will drive IT spending growth.[4] This is one of the driving factors behind implementing EII today. This report goes on to say, "In 2004, regulatory mandates, deregulation, consolidation, and more sophisticated, segmented channel management will increase the urgency." Thus, EII solutions will have a major role to play in fulfilling the demands in this area.

There are a number of processes that data might undergo on the way to becoming information:

  • Aggregation - the process of combining data from disparate sources into a new structure
  • Consolidation - the process of summarizing groups of related data
  • Transformation - the process of turning one set of data in a given structure into one or more sets of data in potentially different structures
  • Filtering - the process of identifying and removing pieces of data that are not relevant to the current process
  • Validation - the process of ensuring the validity of the data by comparing its structure and content, against a predefined set of criteria
  • Cleansing - the process of ensuring the validity of the data by meeting canonical specifications

Moreover, data may pass through each of these processes multiple times, as part of a single business process, or an information gathering exercise.
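The six processes listed above can be sketched in a few lines of code. The example below is a minimal illustration, not a description of how any EII product works; the record layout, source systems, and function names are all assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical order records from two separate source systems.
system_a = [
    {"customer": "ACME", "amount": "120.50", "region": "east"},
    {"customer": "Globex", "amount": "80.00", "region": "west"},
]
system_b = [
    {"customer": "acme ", "amount": "30.25", "region": "east"},
    {"customer": "Initech", "amount": "n/a", "region": "east"},
]


def aggregate(*sources):
    """Aggregation: combine data from disparate sources into one structure."""
    return [dict(row) for source in sources for row in source]


def cleanse(rows):
    """Cleansing: bring names to a canonical form (trimmed, upper case)."""
    return [{**row, "customer": row["customer"].strip().upper()} for row in rows]


def validate(rows):
    """Validation: keep only rows whose amount is a well-formed number."""
    valid = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue
        valid.append(row)
    return valid


def transform(rows):
    """Transformation: reshape each row, converting the amount text to a number."""
    return [
        {"customer": r["customer"], "amount": float(r["amount"]), "region": r["region"]}
        for r in rows
    ]


def filter_region(rows, region):
    """Filtering: remove rows that are not relevant to the current question."""
    return [r for r in rows if r["region"] == region]


def consolidate(rows):
    """Consolidation: summarize groups of related rows as a total per customer."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["customer"]] += r["amount"]
    return dict(totals)


rows = transform(validate(cleanse(aggregate(system_a, system_b))))
east_totals = consolidate(filter_region(rows, "east"))
print(east_totals)
```

The order of the steps here is one choice among several. As noted above, real data may pass through the same process more than once within a single business process.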

The implementations of these processes constitute only part of a complete EII solution. They are merely operations performed on data in order to prepare it for other uses. There is a higher-level component to EII that manages the correlation of data. Correlation is what allows us to connect fragmented islands of data sets into a coherent whole.

When developing a normalized relational database, the wonderful thing is that all the tables can operate from a common key. In contrast, when operating against fragmented, disparate data sets, different conventions may be used, in order to establish the identity of a single logical record. EII's correlation process facilitates mapping identity conventions across data sets to each other, and provides the necessary rules for synthesizing logically coherent records.

The following scenario illustrates this problem. A single company with three acquired business units may have three different ways to identify their customers. The parent company's ability to identify the total business they are doing with any one customer necessitates an information engineering exercise, to map the entities in a common approach. For example, the parent company will first have to develop the ways and means of identifying that Company ACME in Business Unit X, is the same as A.C.M.E in Business Unit Y (see Figure 1). The implementation of this may be as simple as using a lookup table that associates a business unit customer number with a corporate customer number.

Figure 1: Identifying Related Data across Differing Data Sets
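The lookup table mentioned in this scenario can be sketched as a simple mapping from a business unit's customer number to a corporate customer number. All identifiers, amounts, and the function name below are hypothetical; they only illustrate the shape of the technique.

```python
# Hypothetical cross-reference: (business unit, local customer number)
# mapped to a single corporate customer number.
CUSTOMER_XREF = {
    ("X", "C-100"): "CORP-0001",  # "ACME" in Business Unit X
    ("Y", "9917"): "CORP-0001",   # "A.C.M.E" in Business Unit Y
    ("Z", "AC-7"): "CORP-0001",   # the same customer in Business Unit Z
    ("X", "C-200"): "CORP-0002",
}

# Hypothetical sales records as each business unit stores them.
sales = [
    {"unit": "X", "customer_no": "C-100", "amount": 500.0},
    {"unit": "Y", "customer_no": "9917", "amount": 250.0},
    {"unit": "Z", "customer_no": "AC-7", "amount": 125.0},
    {"unit": "X", "customer_no": "C-200", "amount": 40.0},
    {"unit": "Y", "customer_no": "0000", "amount": 10.0},  # not yet mapped
]


def total_by_corporate_customer(records, xref):
    """Sum amounts per corporate customer; collect records with no mapping."""
    totals = {}
    unmapped = []
    for rec in records:
        corp_id = xref.get((rec["unit"], rec["customer_no"]))
        if corp_id is None:
            unmapped.append(rec)
            continue
        totals[corp_id] = totals.get(corp_id, 0.0) + rec["amount"]
    return totals, unmapped


totals, unmapped = total_by_corporate_customer(sales, CUSTOMER_XREF)
print(totals)
print(len(unmapped), "record(s) still need a mapping")
```

Keeping unmapped records visible, rather than silently dropping them, makes it clear where the cross-reference is still incomplete.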

However, there is a requirement here for a human to sit down and look at the attributes of the customers, such as name, address, billing contacts, etc. to determine that ACME Corp. truly is the same as A.C.M.E. Corp. String matching will help in some cases, but more advanced data cleansing tools may be needed in order to determine that these companies truly are one and the same. It is a very realistic scenario that each of these business units might use a different representation to identify the same company.
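A minimal sketch of the string-matching step follows, using Python's standard difflib module. The suffix list and the similarity threshold are arbitrary assumptions for illustration. A match flagged this way is only a candidate; as described above, a person or a more advanced cleansing tool would still need to confirm it against address, billing contacts, and other attributes.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal-form suffixes to ignore when comparing names.
SUFFIXES = {"corp", "corporation", "inc", "co", "ltd", "llc"}


def normalize(name):
    """Lower-case the name, drop punctuation, and remove company suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    words = [w for w in cleaned.split() if w not in SUFFIXES]
    return " ".join(words)


def name_similarity(a, b):
    """Return a ratio between 0.0 and 1.0 for the two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def likely_same_company(a, b, threshold=0.85):
    """Flag a candidate match; the threshold is an assumption, not a standard."""
    return name_similarity(a, b) >= threshold


print(likely_same_company("ACME Corp.", "A.C.M.E. Corp."))
print(likely_same_company("ACME Corp.", "Apex Corp."))
```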

In a nutshell, these are the types of problems that EII attempts to solve. For the businessperson, EII removes the barriers to accessing enterprise data and provides a common infrastructure to transform that data into usable information, usually on demand. For the technologist, EII simplifies the longstanding data integration problem by applying a "semantic veneer" over the complex physical data layer.

EII increases the productivity of front office personnel by:

  • Providing high-quality data
  • Simplifying access to data from front office applications
  • Providing well-understood meaning around the data across the organization
  • Delivering context around the data

As we will see in Chapter 2, EII shares common traits with other integration approaches. If, as you read the previous paragraph, you thought to yourself that you could accomplish this today with the integration tools that you have, you would be correct. However, the remaining questions are how easily you can do it with the integration tools you have now, and whether it would be more easily done with a tool designed specifically for the task of turning raw data into powerful, reusable information. Throughout this book we will continually revisit these questions as we learn more and more about the EII approach.

What is Knowledge?

Knowledge is what we now seek from our information management systems. In 1994, Shlomo Maital predicted in his book, Executive Economics, that "the knowledge economy will radically change the way executives make decisions about labor, capital and knowledge."[5] One of the factors that led him to this conclusion was that knowledge expands the more widely it is shared and the more intensively it is used. He also noted that knowledge is created, shared, and disseminated faster and better in smaller organizations than in large ones. Hence, our information systems must now provide us with more than just basic automation. They must help us understand our world. They must convey knowledge.

Knowledge has been defined as the body of truth, information, and principles acquired by mankind. This definition shows that information is just one component of knowledge. EII solutions do not address knowledge directly, but they are an important step toward creating it. The patterns and facts that we learn or derive by examining information provide us with the knowledge we need to make good decisions. Figure 2 illustrates the hierarchy of reasoning necessary to distill knowledge from raw data.

Figure 2: Data Hierarchy

EII will take us from raw data to information. We still require analytic tools, such as inference engines, to help extract knowledge from the information we create. For example, EII can provide us insight into our customer base by consolidating the customer information into a single structure that can be analyzed simultaneously. This is the benefit of information integration, but EII does not provide us the ability to readily identify that the customer tends to shop on Monday, which is the day after the flyer is delivered. Analytic tools are designed to look for these patterns, but they need the structure derived from the EII solution to be able to have enough supporting data to make such an assertion.
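To make the division of labor concrete, the sketch below assumes the integration layer has already produced a single consolidated purchase history for one customer, and shows the kind of simple pattern search an analytic tool might then run. The dates and the function name are hypothetical, and real analytic tools use far more sophisticated techniques than a frequency count.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical consolidated purchase dates for one customer, assumed to have
# been assembled from several source systems by the integration layer.
purchases = [
    date(2005, 5, 2),
    date(2005, 5, 9),
    date(2005, 5, 12),
    date(2005, 5, 16),
    date(2005, 5, 23),
]


def most_common_weekday(dates):
    """Return the most frequent weekday and its share of all purchases."""
    counts = Counter(DAY_NAMES[d.weekday()] for d in dates)
    day, count = counts.most_common(1)[0]
    return day, count / len(dates)


day, share = most_common_weekday(purchases)
print(day, share)
```

Without the consolidated history, each source system would hold only a few of these dates, which is too little supporting data to assert any pattern.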

Why Invest in EII?

Today's science fiction movies and shows are filled with references to a world where computer systems dynamically communicate with no human assistance. The virtual information screens that Tom Cruise combines at will in "Minority Report," and the ability for Star Trek's Enterprise to analyze even the most obscure and unfamiliar ship upon first contact, are just small examples of this capability. In a perfect world, there would be no need for EII software, because all data would be available to even the most novice user, with little or no effort.

However, the world we live in is comprised of systems that run on different hardware platforms, use different operating systems, have proprietary data formats, and run applications without external interfaces. In short, our data is fragmented and accessible to only those who have the capability to unlock it from its imprisonment.

With EII, we can take steps to provide novice users with the means to view data from a perspective that matches their specific needs. For example, EII can allow newly merged companies to create a comprehensive view of the combined customer base only weeks after the merger, instead of months, or help health and human services case workers recognize inappropriate child placements prior to unfortunate events.

Moreover, EII provides a foundation layer for the eventual incorporation of context-dependent uses of data in new applications and business processes. For example, once a set of data has been absorbed into the EII layer, the workflow and business process management tools can start to leverage the vocabulary that has been captured as the primary means of creating units of work. That is, instead of business analysts struggling to understand technical interfaces to applications and services, analysts can focus on a vocabulary that they understand and work with every day, as a means to drive new business automation.

We will explore this concept throughout the book, but it is important enough to reiterate now: EII provides a means to drive automation that hides (or abstracts) the underlying technical services from the tools and products that will consume data, and data is all that the business is ultimately interested in. Even business process reengineering is analyzed and reviewed for success as a series of metrics that underlie that change.

Key Concepts

  1. Organizations need to gain control over their data if they want to maximize growth and productivity.
  2. The terms data and information should not be used interchangeably. Information is data that has been processed, is accompanied by a supporting context, and conveys some intelligence or knowledge.
  3. The operations that help create information from raw data are aggregation, consolidation, transformation, filtering, validation and cleansing.
  4. Analytic tools are required in order to extract knowledge from information, but EII tools can provide the base of information from which that knowledge is derived.

References:

1. Based on real GDP from 1959-1996.
2. MIT Management. Spring 1992. P. 50.
3. Enterprise Information Integration. The New Way to Leverage E-Information, Second Edition. July 2003.
4. IDC Prediction 2004: New IT Growth Wave, New Game Plan.
5. Executive Economics. Maital. P. 114.


JP Morgenthal is managing partner for Avorcor, an IT consultancy that focuses on integration and legacy modernization. He is also author of Enterprise Information Integration: A Pragmatic Approach. Questions or comments regarding this article can be directed to JP via e-mail at morgenthaljp@avorcor.com. Do you have an idea for a future Enterprise Architecture column? Send it to JP; and if it is used, you will win a free copy of his book.
