Data Mart Migration
Independent data marts have spread like a disease through many of today's best and most advanced corporations. The devastating nature of this disease is that it is not easily detected in its initial stages; however, if it is not treated, the corporation's condition will steadily deteriorate.
"Stranded on Islands of Data" in the November 1998 issue of DM Review covered the characteristics of independent data marts, the flaws in their architecture and the reasons why they exist. This follow-up article covers the approaches to migration, initial planning and how to identify a migration path. It also presents a fictitious case study illustrating how a corporation can migrate from independent data marts to an architected solution: an enterprise data warehouse with multiple dependent data marts.
Approaches to Migration
There are two general approaches for migration: "Big Bang" and "Iterative." Table 1 summarizes the advantages and disadvantages of each approach.
Big Bang Approach: As the name implies, all of the independent data marts are reengineered simultaneously into a structured DSS architecture. There are two advantages to this approach. First, it can provide the fastest path for migration. Companies often need to change their DSS architecture as quickly as possible because additional DSS projects that promise a high ROI are waiting to be implemented. Second, this approach allows for immediate economies of scale rather than attaining them slowly, as in an iterative approach. The disadvantages are that it is labor intensive and requires tremendous coordination. In addition, the "Big Bang" approach is the more complex of the two to implement and thus carries the higher risk exposure.
This approach is best suited when the independent data mart problem is relatively small and not highly complex. However, when the problem is large, the complexity of the migration grows at a tremendous rate.
Iterative Approach: This approach reengineers the independent data marts (one or two data marts at a time) in manageable phases. The advantages to this approach are several. First, it allows a company to manage and reduce the risk involved in a migration effort. This occurs because the migration can be accomplished in a phased manner, thereby increasing the probability of the project's success. Second, as each project phase is executed, lessons are learned and leveraged for subsequent phases. This is very valuable as typically once the first phase is completed, the follow-up phases run much more smoothly.
The major disadvantage to this approach is that it takes longer to fully complete the migration. This approach is best used when the independent data mart problem is large and too complex to tackle all at once.
Many companies fail in their migration efforts well before they start. The chief reason is a lack of initial planning and sponsorship. Attaining executive sponsorship is one of the most important tasks at the onset of the project, because autonomous teams in different corporate departments have typically constructed each independent data mart. A project champion who has cross-departmental authority is therefore essential for dealing with the political challenges that are commonplace in these migration efforts.
During the initial planning phases, it is important to plan on implementing a meta data repository that can support future DSS development efforts and that will provide a semantic layer between the business users and the DSS system. The data mart migration provides an outstanding opportunity to implement the meta data repository. Before the data mart migration begins, it is best to standardize the data naming nomenclature for the DSS system. Implementing standard data naming nomenclature will aid in the DSS system's maintenance and provide cleaner and more understandable meta data.
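As an illustration of what a standard naming nomenclature can look like in practice, the following minimal Python sketch converts column names from different independent data marts into one convention. The abbreviation dictionary and the lower snake_case convention are assumptions made for this example; a real standard would come from the organization's own data administration group.

```python
import re

# Hypothetical abbreviation dictionary (an assumption for this sketch);
# a real one would be defined by the organization's naming standard.
STANDARD_ABBREVIATIONS = {
    "cust": "customer",
    "amt": "amount",
    "qty": "quantity",
    "nbr": "number",
    "dt": "date",
}


def standardize_name(raw_name: str) -> str:
    """Convert a column name to lower snake_case with expanded abbreviations."""
    # Insert a space at camelCase boundaries (lowercase or digit, then uppercase).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", raw_name)
    # Split on any run of non-alphanumeric characters and drop empty tokens.
    tokens = [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]
    # Expand known abbreviations; leave unknown tokens unchanged.
    return "_".join(STANDARD_ABBREVIATIONS.get(t, t) for t in tokens)


if __name__ == "__main__":
    for name in ("CustNbr", "ORDER-AMT", "ship dt"):
        print(name, "->", standardize_name(name))
```

Running the same function over every mart's column list before migration gives the meta data repository one consistent vocabulary to store.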
A great deal of research needs to be conducted on the independent data marts before a migration is possible. (Table 2 summarizes these tasks.) The most important research activity is to understand the business needs that each independent data mart is meeting. Typically multiple independent data marts will exist to meet the same or similar business needs. These situations are common and do suggest a path for migration. The results of this research will identify the independent data marts that will be the most difficult to migrate.
The independent data mart migration is also an excellent time to standardize on hardware and software for the DSS project. For each differing software or hardware platform, a company needs trained personnel to support it. By eliminating redundant software and hardware, the corporation reduces the support strain on its IT staff. In addition, purchasing economies of scale can be achieved.
The central covenant of any independent data mart migration effort is to "never deliver less functionality to the business users than what they have today." Generally business users do not react well to spending money on infrastructure because they don't initially see its value. The key business users need to understand that a bad system architecture leads to a non-scalable and non-flexible system that will eventually need to be rewritten at a very high cost. Therefore, during migration the users must be assured that they will not receive less functionality (information, ease of use and response time) than they are currently receiving.
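The covenant can be made checkable before each cutover. The sketch below is one possible way to do that, assuming functionality is approximated by the set of reports delivered and a worst-case response time; the class and field names are hypothetical, and ease of use is left out because it does not reduce to a simple number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    """Hypothetical summary of what business users receive from a mart."""
    reports: frozenset           # names of reports or queries delivered
    max_response_seconds: float  # worst acceptable response time


def covenant_violations(current: ServiceLevel, proposed: ServiceLevel) -> list:
    """List the ways the proposed mart would deliver less than the current one."""
    problems = []
    missing = current.reports - proposed.reports
    if missing:
        problems.append(f"missing reports: {sorted(missing)}")
    if proposed.max_response_seconds > current.max_response_seconds:
        problems.append("slower response time")
    return problems


if __name__ == "__main__":
    today = ServiceLevel(frozenset({"sales_by_region", "returns"}), 30.0)
    planned = ServiceLevel(frozenset({"sales_by_region"}), 20.0)
    print(covenant_violations(today, planned))
```

An empty list means the dependent data mart delivers at least what the independent one does today, on the dimensions this sketch measures.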
There are several activities that need to be conducted before a migration path will be evident.
First, diagram the current DSS architecture. This is critical for identifying which legacy systems are feeding which independent data marts (See Figure 1).
Often independent data marts will be sourced from the same legacy systems. By targeting independent data marts with the same source data, multiple independent data marts often can be removed with minimal effort. Identifying redundant data often suggests a migration path.
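Once the current architecture is diagrammed, the overlap between marts can be found mechanically. The following sketch assumes the diagram has been captured as a simple mapping from each independent data mart to the legacy systems feeding it; the mart and system names are hypothetical.

```python
from itertools import combinations

# Hypothetical inventory taken from the architecture diagram:
# each independent data mart mapped to the legacy systems that feed it.
mart_sources = {
    "marketing": {"order_entry_old", "order_entry_new", "campaigns"},
    "finance": {"order_entry_old", "order_entry_new", "general_ledger"},
    "quality_control": {"manufacturing"},
}


def shared_sources(inventory: dict) -> list:
    """Return (mart_a, mart_b, common_sources) tuples, most overlap first."""
    pairs = []
    for a, b in combinations(sorted(inventory), 2):
        common = inventory[a] & inventory[b]
        if common:
            pairs.append((a, b, sorted(common)))
    # Pairs sharing the most source systems are the strongest candidates
    # for consolidation in the same migration phase.
    return sorted(pairs, key=lambda p: len(p[2]), reverse=True)


if __name__ == "__main__":
    for a, b, common in shared_sources(mart_sources):
        print(f"{a} + {b}: {common}")
```

Pairs at the top of the list read from the same legacy systems, so migrating them together lets one extraction serve both marts.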
Identify Paths of Least Resistance
Data. It is important to target those independent data marts whose data will most likely be used in future DSS efforts. Targeting these data marts first eases the task of keeping all new DSS development activity within the newly architected environment.
The next step is to identify those data marts whose transformation rules are known and documented. Understand that even the best-documented transformation rules will have gaps. Moreover, even marts built using ETL (extraction/transformation/load) tools have gaps in their meta data (documentation). For example, ETL tools often provide the ability to call user exits, which are hand-coded programs. The processes performed by these user exits will not be captured in the ETL tool's meta data stores. If documentation does not exist for a mart, programmers will need to analyze the ETL program's code manually to extract the transformation rules, which is a very time-consuming and expensive activity.
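A simple inventory of transformation steps makes these gaps visible early. The sketch below assumes each step has been catalogued by hand with two flags; the record layout is hypothetical and is not the export format of any particular ETL tool.

```python
from dataclasses import dataclass


@dataclass
class TransformStep:
    """Hypothetical catalogue entry for one transformation step."""
    mart: str
    name: str
    documented: bool  # rule is captured in ETL meta data or written documentation
    user_exit: bool   # step calls a hand-coded program outside the ETL tool


def needs_code_review(steps: list) -> dict:
    """Group by mart the steps whose rules must be recovered by reading code."""
    review = {}
    for step in steps:
        # User exits are flagged even when marked documented, because the
        # ETL tool's meta data cannot see what the hand-coded program does.
        if step.user_exit or not step.documented:
            review.setdefault(step.mart, []).append(step.name)
    return review


if __name__ == "__main__":
    catalogue = [
        TransformStep("marketing", "load_orders", True, False),
        TransformStep("marketing", "dedupe_customers", True, True),
        TransformStep("finance", "allocate_costs", False, False),
    ]
    print(needs_code_review(catalogue))
```

Marts with long review lists are the ones where manual code analysis will drive the cost of migration.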
Political. It will be critical to obtain support from the current independent data mart IT teams and business users. Identify those data mart teams most likely to work cooperatively with the centralized DSS team. Recognize the strengths and weaknesses of those teams that can and will provide the most aid. If particular data mart teams/business users are not willing to assist with the migration effort, it is best to delay the migration of their particular data mart. If this is not an option, utilize your executive sponsorship to "motivate" this group to provide their support.
Strengths and weaknesses. Keep in mind that any team will have its stronger and weaker areas of knowledge. As much as possible, keep teams' areas of weakness off of the critical path. Any mission-critical team weaknesses need to be shored up with internal members from the other data mart teams or from outside vendors.
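The criteria above can be combined into a rough ranking of migration candidates. The sketch below is illustrative only: the 0 to 5 scoring scale, the equal weights and the sample scores are all assumptions, and a real assessment would rest on the research described earlier rather than on a formula.

```python
from dataclasses import dataclass


@dataclass
class MartAssessment:
    """Hypothetical 0-5 scores for one independent data mart."""
    name: str
    future_data_reuse: int  # likelihood its data is needed by future DSS efforts
    rules_documented: int   # how well its transformation rules are documented
    team_cooperation: int   # willingness of its IT team and business users to help


# Equal weights are an assumption; adjust them to local priorities.
WEIGHTS = {"future_data_reuse": 1, "rules_documented": 1, "team_cooperation": 1}


def score(mart: MartAssessment) -> int:
    """Weighted sum of the three criteria; higher means less resistance."""
    return sum(weight * getattr(mart, field) for field, weight in WEIGHTS.items())


def migration_order(assessments: list) -> list:
    """Return mart names ordered from least to most resistance."""
    return [m.name for m in sorted(assessments, key=score, reverse=True)]


if __name__ == "__main__":
    marts = [
        MartAssessment("quality_control", 3, 2, 1),
        MartAssessment("marketing", 5, 4, 5),
        MartAssessment("finance", 4, 4, 3),
    ]
    print(migration_order(marts))
```

The ranking is a starting point for discussion; a mart with a low cooperation score may still need to move early if executive sponsorship requires it.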
The following case study puts the concepts we've discussed into action. This case study illustrates the iterative approach to independent data mart migration.
The XYZ company is a Fortune 500 consumer electronics firm. XYZ recently acquired a smaller company (Acme Electronics) that has a single marketing data mart about which little is known. In addition, XYZ is standardizing on a new order entry system in five years, and existing batch windows for the legacy systems have reached their limit. XYZ's management team is stable, well organized and fully supports the migration effort. Table 3 lists the DSS specific details.
Phase One: By viewing the data, it is evident that XYZ's marketing and finance data marts share two common data sources (old and new order entry systems). In addition, the XYZ marketing data mart has a strong end-user community that will be highly supportive of the migration effort. In addition, both the marketing and finance data marts' business users have agreed to freeze their additional functionality requests for phase one of the migration.
Phase one does not include migrating the XYZ quality control data mart or the Acme marketing data mart due to the lack of support in the quality control mart and all the unknowns associated with the Acme marketing mart.
Phase Two: During this phase, the operational logistical system's data is brought into the data warehouse, and the quality control data mart is sourced directly from the enterprise data warehouse. In addition, the marketing and finance teams' change requests that were frozen during phase one are developed. Lastly, a new dependent accounting data mart is sourced from the data warehouse.
Phase Three: This phase merges the functionality of the former Acme Electronics marketing data mart into the existing dependent marketing data mart. Also, additional data marts are continuing to appear (e.g., CEO data mart). Figure 2 illustrates all three phases of the DSS architecture.
It is important to understand that migrating off of an independent data mart architecture is a costly proposition that will only get more expensive and difficult as time goes on. Remember, as with any disease, the earlier it is detected and treatment begins, the sooner the patient will become healthy. However, if treatment is delayed, the patient's condition will worsen and eventually become terminal.
David Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence and is the world's foremost authority on meta data. He is the author of Universal Meta Data Models (Wiley, 2004) and Building and Managing the Meta Data Repository: A Full Life-Cycle Guide (Wiley, 2000). Marco has taught at the University of Chicago and DePaul University, and in 2004 he was selected to the prestigious Crain's Chicago Business "Top 40 Under 40." He is the founder and president of Enterprise Warehousing Solutions, Inc., a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing and meta data repository technologies. He may be reached at (866) EWS-1100 or via e-mail at DMarco@EWSolutions.com.