Meta Data & Knowledge Management:
Managed Meta Data Environment: A Complete Walk-Through, Part 2

Column published in DM Review Magazine, May 2004 Issue
By David Marco

Author's Note: This column is the second in an eight-part series adapted from the book Universal Meta Data Models by David Marco and Michael Jennings (John Wiley & Sons).

Last month, I presented the managed meta data environment (MME) and its six components (meta data sourcing layer, meta data integration layer, meta data repository, meta data management layer, meta data marts and meta data delivery layer). In this column, I will discuss the meta data sourcing layer and begin to examine the most common sources of meta data targeted by this layer.

Meta Data Sourcing Layer

The purpose of the meta data sourcing layer, the first component of the MME architecture, is to extract meta data from its source and send it into the meta data integration layer or directly into the meta data repository (see Figure 1). Some meta data will be accessed by the MME through pointers to distributed meta data; these pointers present the meta data to the end user at the time it is requested. The pointers are managed by the meta data sourcing layer and stored in the meta data repository.
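
To make this hand-off concrete, here is a small sketch in Python (the record layout and function names are my own illustration, not part of the MME specification). Each extract either carries the meta data itself or a pointer to distributed meta data that is resolved when an end user requests it:

# Illustrative only: a sourcing-layer hand-off record that carries either the
# extracted meta data itself or a pointer to distributed meta data.
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class MetaDataExtract:
    source_system: str                      # e.g., a modeling tool or a database catalog
    subject: str                            # what the meta data describes (table, column, ...)
    payload: Optional[dict] = None          # meta data copied out of the source
    pointer: Optional[str] = None           # location of distributed meta data, resolved on request
    extracted_at: Optional[datetime] = None

    def is_distributed(self) -> bool:
        return self.pointer is not None

def extract(source_system: str, subject: str, payload: dict = None,
            pointer: str = None) -> MetaDataExtract:
    """Package meta data (or a pointer to it) for the integration layer or repository."""
    if (payload is None) == (pointer is None):
        raise ValueError("Provide exactly one of payload or pointer")
    return MetaDataExtract(source_system, subject, payload, pointer,
                           extracted_at=datetime.now(timezone.utc))

Either way, the sourcing layer produces a uniform hand-off, whether the meta data is copied into the repository or left distributed behind a pointer.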

It is best to send the extracted meta data to the same hardware location as the meta data repository. Meta data architects often build meta data integration processes on the platform from which the meta data is sourced (record selection on the source platform is the exception and is acceptable). This merging of the meta data sourcing layer with the meta data integration layer is a common mistake that causes a host of issues.

As sources of meta data are changed and added (and they will be), the meta data integration process is negatively impacted. When the sourcing layer is kept separate from the integration layer, such changes affect only the meta data sourcing layer. By keeping all of the extracted meta data together on the target platform, the meta data architect can adapt the integration processes much more easily.


Figure 1: Meta Data Sourcing Layer

Keeping the extraction layer separate from the integration layer also provides a tidy backup and restart point. Meta data loading errors typically occur in the meta data transformation layer. Without the extraction layer, if an error occurred, the architect would be required to go back to the source of the meta data and re-read it. This can cause a number of problems. If the source of meta data has been updated, it may become out of sync with some of the other sources of meta data with which it integrates. Also, the meta data source may currently be in use, and re-reading it could impact the source's performance. The golden rule of meta data extraction is: never have multiple processes extracting the same meta data from the same meta data source.
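
Under the assumption that raw extracts are staged as files on the repository platform, a minimal sketch of that restart point might look like the following (the staging directory and function names are illustrative only):

import json
from pathlib import Path

STAGING_DIR = Path("mme_staging")   # assumed to sit on the repository's platform

def stage_extract(source_name: str, records: list[dict]) -> Path:
    """Persist the raw extract once; later steps never touch the source again."""
    STAGING_DIR.mkdir(exist_ok=True)
    staged_file = STAGING_DIR / f"{source_name}.json"
    staged_file.write_text(json.dumps(records))
    return staged_file

def load_with_restart(staged_file: Path, transform, load) -> None:
    """If the transform or load fails, rerun from the staged file, not from the source."""
    records = json.loads(staged_file.read_text())
    try:
        load(transform(records))
    except Exception:
        # The staged extract is still intact; fix the transformation and rerun
        # this function without another pass at the meta data source.
        raise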

When multiple processes extract the same meta data from the same source, the timeliness and, consequently, the accuracy of the meta data can be compromised. For example, suppose that you have built one meta data extraction process (Process #1) that reads physical attribute names from a modeling tool's tables to load a target entity in the meta model that holds physical attribute names. You also built a second process (Process #2) to read and load attribute domain values. The attribute table in the modeling tool could change between the running of Process #1 and Process #2, leaving the meta data out of sync.
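
Assuming the modeling tool's attribute table can be read once into a single snapshot, a sketch of the single-process alternative might look like this (the row layout and field names are invented for illustration):

def extract_attribute_metadata(modeling_tool_rows: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split one consistent read of the modeling tool's attribute table into the two
    target structures that Process #1 and Process #2 would otherwise load separately."""
    attribute_names = []
    domain_values = []
    for row in modeling_tool_rows:          # a single pass over one snapshot
        attribute_names.append({"attribute": row["physical_name"]})
        domain_values.append({"attribute": row["physical_name"],
                              "domain": row["domain_values"]})
    return attribute_names, domain_values

# Both target loads come from the same point-in-time view of the source.
snapshot = [{"physical_name": "CUST_ID", "domain_values": "integer, not null"}]
names, domains = extract_attribute_metadata(snapshot)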

Multiple extraction processes can also cause unnecessary delays in loading the meta data when a source has limited availability or batch windows. For example, if you were reading database logs from your enterprise resource planning (ERP) system, you would not want to run multiple extraction processes against those logs because the system most likely has a limited number of available batch windows. While this doesn't happen often, there is no reason to build unnecessary flaws into your meta data architecture.

The number and variety of meta data sources will vary greatly based on the business requirements of your MME. While there are some common sources of meta data, I've never seen two meta data repositories with exactly the same meta data sources. The most common meta data sources are software tools, end users, documents and spreadsheets, messaging and transactions, applications, Web sites and e-commerce, and third parties.

Software Tools

A great deal of valuable meta data is stored in various software tools. Keep in mind that many of these tools have internal meta data repositories designed to enable the tool's specific functionality. Typically, they are not designed to be accessed by meta data users or integrated into other sources of meta data. You will need to set up processes to go into these tools' repositories and pull the meta data out.

Of these tools, relational databases and modeling tools are the most common sources of meta data for the meta data sourcing layer. The MME usually reads the database's system tables to extract meta data about physical column names, logical attribute names, physical table names, logical entity names, relationships, indexing, change data capture and data access.
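
As a simplified illustration of reading system tables, the following Python sketch uses the built-in sqlite3 module. The sample table is invented, and a production MME would query the equivalent catalog views of its own database (for example, information_schema), but the pattern of reading the catalog rather than the data itself is the same:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT)")

# Physical table names from the catalog.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Physical column names, data types, nullability and keys for each table.
column_metadata = []
for table in tables:
    for cid, name, col_type, notnull, default, pk in conn.execute(
            f"PRAGMA table_info({table})"):
        column_metadata.append({"table": table, "column": name,
                                "type": col_type, "nullable": not notnull,
                                "primary_key": bool(pk)})

print(column_metadata)   # rows ready for the meta data integration layer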

Part 3 of this series will continue to walk through the sources of meta data targeted by the meta data extraction layer.


David Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence and is the world's foremost authority on meta data. He is the author of Universal Meta Data Models (Wiley, 2004) and Building and Managing the Meta Data Repository: A Full Life-Cycle Guide (Wiley, 2000). Marco has taught at the University of Chicago and DePaul University, and in 2004 he was selected to the prestigious Crain's Chicago Business "Top 40 Under 40."  He is the founder and president of Enterprise Warehousing Solutions, Inc., a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing and meta data repository technologies. He may be reached at (866) EWS-1100 or via e-mail at DMarco@EWSolutions.com.
