Published in DM Review in July 2004.

Meta Data & Knowledge Management: Managed Meta Data Environment: A Complete Walk-Through, Part 4

by David Marco

This column is adapted from the book Universal Meta Data Models by David Marco and Michael Jennings (John Wiley & Sons).

Last month's column presented the meta data sourcing layer of a managed meta data environment (MME), along with a walk-through of two of the most common sources of meta data: end users, and documents and spreadsheets. This month's column will walk through the remaining common meta data sources: messaging and transactions, applications, Web sites and e-commerce, and third parties.

Many companies and government agencies use some form of messaging and transactions, either enterprise application integration (EAI) or XML (EAI applications sometimes use XML), to transfer data from one system to another. EAI and XML have become popular as enterprises struggle with the high cost of maintaining point-to-point approaches to data integration. The problem with point-to-point integration is that the IT environment becomes so complex that it is impossible to manage effectively or efficiently, especially without an enterprise-level MME. An EAI messaging paradigm should help companies unravel their current point-to-point integration approaches. Figure 1 shows an EAI messaging bus, which provides the technical engine for the EAI messages.

Figure 1: EAI Messaging Bus

While the vast majority of companies are not very advanced in their use of EAI and XML, these processes can be used to capture highly valuable meta data: business rules, data quality statistics, data lineage, data rationalization processes and so on. Because EAI tools are designed to manage the messaging bus, not the meta data around it, it is important to bring this meta data from the EAI tools into the MME to allow for global access, historical management, publishing and distribution. Without a good MME, these types of applications become very difficult to maintain. Large government organizations and major corporations are using their MMEs to address this challenge.
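As a rough illustration of capturing meta data from messages, the sketch below walks a single XML message and records its structure (element paths, attribute names, source and target systems) rather than its business data. The message layout, the `source` and `target` attributes, and the `describe_message` function are all hypothetical; a real EAI tool would expose its own formats and interfaces.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical EAI message; element and attribute names are invented for illustration.
SAMPLE_MESSAGE = """<order source="billing" target="warehouse">
  <customer id="C100"><name>Acme</name></customer>
  <line sku="A1" qty="2"/>
  <line sku="B7" qty="1"/>
</order>"""


def describe_message(xml_text):
    """Return structural meta data about one XML message, not its business data."""
    root = ET.fromstring(xml_text)
    paths = Counter()   # how often each element path occurs
    attributes = {}     # attribute names seen on each element path

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        paths[path] += 1
        attributes.setdefault(path, set()).update(element.attrib)
        for child in element:
            walk(child, path)

    walk(root, "")
    return {
        "root": root.tag,
        "source_system": root.get("source"),  # assumed attribute
        "target_system": root.get("target"),  # assumed attribute
        "element_paths": dict(paths),
        "attributes": {p: sorted(a) for p, a in attributes.items()},
    }


meta = describe_message(SAMPLE_MESSAGE)
```

Records like `meta` could then be loaded into the MME so that message structures and system-to-system flows are visible outside the EAI tool itself.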

Within the wide array of applications a corporation uses, some are custom-built by the enterprise's IT department (e.g., data warehouses, general ledger systems, payroll, supply chain management), while others are based on packages (e.g., PeopleSoft, SAP, Siebel). Some may be outsourced or based on an application service provider (ASP) model. The number of applications can be very large (we know of several corporations and government agencies whose applications number in the thousands).

Each of these applications contains valuable meta data that may need to be extracted and brought into the MME application. Assuming the applications are built on one of the popular relational database platforms (e.g., IBM, Oracle, Microsoft, Sybase, Teradata), the meta data sourcing layer can read the system tables or logs of these databases. Considerable meta data is also stored within the applications themselves: business rules and lookup values are often buried within the application code or control tables. In these situations, a process needs to be built to extract this meta data and bring it into the MME.
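Reading system tables can be sketched as follows. This example uses SQLite from the Python standard library purely as a stand-in, since it runs without a database server; the platforms named above each have their own catalog views and would need different queries. The `policy` table and the `read_catalog` function are hypothetical.

```python
import sqlite3


def read_catalog(conn):
    """Collect table and column meta data from SQLite's system catalog."""
    catalog = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        quoted = table.replace('"', '""')  # escape quotes in the identifier
        # PRAGMA table_info returns: cid, name, type, notnull, dflt_value, pk
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{quoted}")'
        ):
            catalog.append({
                "table": table,
                "column": name,
                "type": col_type,
                "nullable": not notnull,
                "primary_key": bool(pk),
            })
    return catalog


# Hypothetical application table, created in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policy ("
    "policy_id INTEGER PRIMARY KEY, holder TEXT NOT NULL, premium REAL)"
)
catalog = read_catalog(conn)
```

Each entry in `catalog` describes one column, which is the kind of technical meta data a sourcing layer would load into the MME. Business rules buried in application code would not appear here and would still need their own extraction process.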

One of the least-used sources of meta data is corporate Web sites. Many companies overlook the amount of valuable meta data that is contained (or locked away) in HTML on Web sites. For example, healthcare insurance industry analysts need to know the latest information about the testing of a new drug for patient treatments. Research is typically conducted by a doctor working with a hospital. The doctor usually posts the findings to the hospital's Web site or portal; therefore, it is important to capture meta data around these Web sites, such as when the site was updated, what was updated and so on.
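A minimal sketch of pulling meta data out of HTML, using only Python's standard `html.parser` module, might look like the following. The sample page, the `last-modified` meta tag and the `PageMetaParser` and `describe_page` names are assumptions made for illustration; real pages vary widely in which tags they provide.

```python
from html.parser import HTMLParser


class PageMetaParser(HTMLParser):
    """Collect the page title and any <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page content; the meta tag names are assumptions.
SAMPLE_PAGE = """<html><head>
<title>Trial Results</title>
<meta name="author" content="Dr. Example">
<meta name="last-modified" content="2004-06-15">
</head><body><p>Findings...</p></body></html>"""


def describe_page(html_text):
    parser = PageMetaParser()
    parser.feed(html_text)
    parser.close()
    return {"title": parser.title.strip(), **parser.meta}


page_meta = describe_page(SAMPLE_PAGE)
```

Comparing successive `page_meta` records for the same page is one simple way to answer "when was the site updated, and what changed," assuming the page publishes such tags at all.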

This also applies to e-commerce. When a customer orders a product via the Web, valuable meta data is generated and needs to be captured in the MME.

For many companies, it is a standard business process to interact heavily with third parties. Companies in banking, healthcare, finance and certain types of manufacturing, as well as national defense-related agencies, need to interact with business partners, suppliers, vendors, customers and government or regulatory agencies on a daily basis. For every systematic interaction, these external data sources generate meta data that should be extracted and brought into the MME.[1]

[1] See Chapter 2 of Building and Managing the Meta Data Repository (David Marco, Wiley 2000) for a more detailed discussion of external meta data sources.
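To make the idea of meta data from third-party interactions concrete, the sketch below profiles one delivered file and records who sent it, when it arrived, a checksum, its column names and its row count. The CSV format, the partner and file names, and the `profile_delivery` function are hypothetical; actual third-party feeds may arrive in any format.

```python
import csv
import hashlib
import io
from datetime import datetime, timezone


def profile_delivery(partner, file_name, payload):
    """Describe one third-party file delivery without storing its contents."""
    reader = csv.reader(io.StringIO(payload.decode("utf-8")))
    header = next(reader, [])            # first row is assumed to be column names
    row_count = sum(1 for _ in reader)   # remaining rows are data rows
    return {
        "partner": partner,
        "file_name": file_name,
        "received_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "columns": header,
        "row_count": row_count,
    }


# Hypothetical delivery from an external partner.
SAMPLE = b"claim_id,amount\n1001,250.00\n1002,75.50\n"
record = profile_delivery("Example Supplier", "claims.csv", SAMPLE)
```

Keeping one such record per delivery gives the MME a history of what each external source sent and whether its layout changed over time.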

David Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence and is the world's foremost authority on meta data. He is the author of Universal Meta Data Models (Wiley, 2004) and Building and Managing the Meta Data Repository: A Full Life-Cycle Guide (Wiley, 2000). Marco has taught at the University of Chicago and DePaul University, and in 2004 he was selected to the prestigious Crain's Chicago Business "Top 40 Under 40." He is the founder and president of Enterprise Warehousing Solutions, Inc., a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing and meta data repository technologies. He may be reached at (866) EWS-1100 or via e-mail at

Copyright 2005, SourceMedia and DM Review.