DM Review | Covering Business Intelligence, Integration & Analytics
Meta Data and Data Administration:
XML's Uses In Data Warehousing: Getting Data In

Column published in DM Review Magazine, June 2000 Issue
By David Marco

Over the past 10 years, data warehousing has proven to be a highly valuable technology that the vast majority of corporations have leveraged to gain a competitive edge in the marketplace. As we enter the next decade, extensible markup language (XML) is poised to accomplish much the same. The one unanswered question is how these two essential technologies will function together.

Virtually all Web sites have been built with hypertext markup language (HTML), which describes how data will be formatted but provides no information about what the data means. Consequently, this unstructured Web-site data is very difficult to bring into a data warehouse system. XML remedies this situation by assigning data tags to the Web-site information. To understand how these data tags function, let's use XML to describe the information about a textbook:

Building and Managing the Meta Data Repository

David Marco

John Wiley & Sons
New York
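
A minimal sketch of how the textbook details above might be tagged and then read back by a program. The tag names (`book`, `title`, `author`, `publisher`, `city`) are assumptions chosen for illustration, not a published standard, and Python's standard `xml.etree.ElementTree` module is used only as a convenient parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names; an agreed-upon schema could name these differently.
# Note that "&" must be written as "&amp;" inside XML text.
BOOK_XML = """
<book>
  <title>Building and Managing the Meta Data Repository</title>
  <author>David Marco</author>
  <publisher>John Wiley &amp; Sons</publisher>
  <city>New York</city>
</book>
"""


def describe(xml_text):
    """Return a dict mapping each data tag to the value it wraps."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: child.text for child in root}


book = describe(BOOK_XML)
for tag, value in book.items():
    print(f"{tag}: {value}")
```

Because each value carries its own tag, a program can ask for the publisher or the city by name instead of guessing from page layout.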

By adding context to the content on a Web site, XML enables corporations to bring unstructured Web-site data into their data warehouses. This is critical for the many analysts who need this information to make better decisions. Let's walk through an example using a healthcare company. Many doctors who research drugs publish their results on their Web sites. Often the decision-makers in these healthcare organizations want to know about the latest developments in this drug research in order to make better patient-care decisions. To see how XML simplifies this challenge, we will examine Figure 1.

Figure 1: XML Bringing Data into the Data Warehouse

Figure 1 illustrates data being read from a physician's Web site and brought into an XML transformation process (see Figure 1, bullet 1). This transformation process (bullet 3) matches the Web-site data to the corresponding XML schema (data tag layout). Remember that one of the key challenges for XML is to standardize the names and meanings of the data tags. As an industry, IT has had limited success in defining global standards, and I don't expect XML to change this trend. Therefore, we will have to juggle multiple XML schemas in our corporations. Next, the XML transformation process converts the tagged Web-site data into record format by removing the XML data tags, which is important because these tags increase processing overhead. These records are sent to the extraction, transformation and load (ETL) process of the data warehouse (bullet 4). The ETL process will clean, integrate and load this data into the data warehouse and its corresponding data marts (bullet 5). Keep in mind that several ETL tool vendors are looking to expand their current toolsets to include XML transformation functionality, so this XML transformation process (bullet 3) could be completely merged into the ETL process.
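
The transformation step described above can be sketched roughly as follows: check that the tagged data matches a tag layout, then strip the tags and emit a flat record for the ETL process. This is only an illustration under stated assumptions; the layout `RESEARCH_LAYOUT`, its tag names, the helper functions and the sample values are all invented here and do not come from Figure 1.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical tag layout for a drug-research result; the order of the
# tags defines the column order of the output record.
RESEARCH_LAYOUT = ("drug", "physician", "finding", "published")


def to_record(xml_text, layout):
    """Match tagged data to a tag layout and return the untagged values."""
    root = ET.fromstring(xml_text.strip())
    values = {child.tag: (child.text or "").strip() for child in root}
    missing = [tag for tag in layout if tag not in values]
    if missing:
        raise ValueError(f"document does not match layout, missing: {missing}")
    return [values[tag] for tag in layout]


def to_delimited(records):
    """Serialize records as delimited text for hand-off to an ETL load step."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(records)
    return buffer.getvalue()


# Invented sample data, for illustration only.
SAMPLE = """
<result>
  <drug>ExampleDrug</drug>
  <physician>Dr. A. Example</physician>
  <finding>Reduced symptoms in trial group</finding>
  <published>2000-05-01</published>
</result>
"""

record = to_record(SAMPLE, RESEARCH_LAYOUT)
print(to_delimited([record]), end="")
```

The delimited line is much shorter than the tagged source, which is the overhead saving the paragraph above refers to; the tag names survive only as the column order defined by the layout.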

Often when we think of the Internet we think about business-to-customer (B2C) transactions; however, the potential for business-to-business (B2B) commerce on the Internet is far greater than that of B2C. Many companies are in the business of selling information. XML plays a major role in this effort, as it allows B2B transactions to be brought directly into a data warehouse. Figure 1, bullet 2 shows how the B2B trading partner sends information into the XML transformation process. As before, not all B2B trading partners will use the standard XML schemas, so multiple XML schemas will need to be maintained. This process (bullet 3) uses the XML schemas stored in the XML database and moves these converted transactions into the ETL process of the data warehouse (bullet 4). The ETL process then integrates this information into the data warehouse and its data marts (bullet 5).
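
To make the multiple-schema point concrete, here is a small sketch in which two trading partners tag the same transaction differently and a schema lookup maps both onto one set of warehouse field names. The in-memory `SCHEMAS` dictionary stands in for the XML database of schemas, and every tag name, field name and value in it is a hypothetical example.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema registry: root tag -> {partner tag: warehouse field}.
# In practice these layouts would be stored and maintained centrally.
SCHEMAS = {
    "order": {"sku": "product_id", "qty": "quantity", "buyer": "customer"},
    "purchase": {"item": "product_id", "units": "quantity", "client": "customer"},
}


def normalize(xml_text, schemas=SCHEMAS):
    """Pick the schema by root tag and rename partner tags to warehouse fields."""
    root = ET.fromstring(xml_text.strip())
    mapping = schemas.get(root.tag)
    if mapping is None:
        raise KeyError(f"no schema registered for <{root.tag}>")
    return {
        mapping[child.tag]: (child.text or "").strip()
        for child in root
        if child.tag in mapping  # tags outside the schema are ignored
    }


# Invented sample transactions from two partners using different tag names.
partner_a = "<order><sku>A-100</sku><qty>3</qty><buyer>Acme</buyer></order>"
partner_b = "<purchase><item>A-100</item><units>3</units><client>Acme</client></purchase>"

print(normalize(partner_a))
print(normalize(partner_b))
```

Both calls yield the same field names and values, so the ETL process downstream sees a single record layout no matter which partner's schema the transaction arrived in.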

As we can see, XML is a critical technology, and it is coming to a data warehouse near you!



David Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence and is the world's foremost authority on meta data. He is the author of Universal Meta Data Models (Wiley, 2004) and Building and Managing the Meta Data Repository: A Full Life-Cycle Guide (Wiley, 2000). Marco has taught at the University of Chicago and DePaul University, and in 2004 he was selected to the prestigious Crain's Chicago Business "Top 40 Under 40."  He is the founder and president of Enterprise Warehousing Solutions, Inc., a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing and meta data repository technologies. He may be reached at (866) EWS-1100 or via e-mail at DMarco@EWSolutions.com.

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.