DM Review | Covering Business Intelligence, Integration & Analytics
Knowledge: The Essence of Meta:
Meta Data Management Life Cycle Reviewed

By R. Todd Stephens, Ph.D.
Online column published in DMReview.com, December 18, 2003

Meta data does not simply appear out of nowhere, nor does it mystically fade away. Meta data is managed around the life of an asset. One of the hard lessons we have learned over the past few years is that the value of meta data slowly degrades over time for reasons such as poor quality, staleness and lack of use. We can loosely define an asset as any person, place or thing within the technological community. Examples of assets include databases, logical models, physical models, XML structures, components, documents, metrics, systems and interfaces. Figure 1 provides a high-level view of the meta data management life cycle around an asset.


Figure 1: Meta Data Management Life Cycle

The asset itself can be described as a container of data, information, knowledge and/or wisdom that must be surgically extracted. A process acquires the meta data from the asset; this can be an automated extraction process or one performed by hand. A hand-performed load can be used in conjunction with an extraction utility and, in most cases, is required in order to fill in the information gaps. A third, fairly obvious option is to integrate a tool or collection of tools into the system development life cycle. This would solve 90 percent of the issues we have and push meta data into the most active role possible. However, a large enterprise environment is not as homogeneous as people would lead us to believe. In addition, the odds are that the majority of your technology is not built upon a current set of standards, which makes automatic enterprise integration nearly impossible, if not extremely expensive. While these processes are fairly well known and documented in various publications, the next series of steps is a source of much confusion and strife.
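To make the acquisition step concrete, here is a minimal sketch of an automated extraction pass, assuming a SQLite database stands in for the asset. The table, file and field names are hypothetical; note the fields the scan cannot fill, which is exactly where the hand-performed load comes in.

```python
import sqlite3

def extract_table_metadata(db_path):
    """Automated extraction pass: pull table and column names from a
    SQLite database's catalog into simple meta data records."""
    conn = sqlite3.connect(db_path)
    assets = []
    for (table,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"):
        columns = [row[1] for row in
                   conn.execute(f"PRAGMA table_info({table})")]
        assets.append({
            "asset": table,
            "type": "database table",
            "columns": columns,
            # Gaps an automated scan cannot fill; these are
            # completed by hand, as described above:
            "owner": None,
            "description": None,
        })
    conn.close()
    return assets

# Hypothetical asset: a one-table database.
conn = sqlite3.connect("assets.db")
conn.execute("CREATE TABLE IF NOT EXISTS customer (id INTEGER, name TEXT)")
conn.commit()
conn.close()
print(extract_table_metadata("assets.db"))
```

The same shape of record could come from a modeling tool, an XML schema or a document library; only the extraction logic changes per asset type.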

The divergence of thought comes from the value generated by the passive and active utility built around the asset. Passive utility can be defined as the publishing, indexing, searching and result generation of meta data information. Many experts argue that passive meta data has limited value, but we have many examples where this type of utility is not only valued but demanded. It is widely recognized that an organization's most valuable knowledge, its essential intellectual capital, is not limited to information contained in official document repositories and databases: scientific formulae, "hard" research data, computer code, codified procedures, financial figures, customer records and the like (Bobrow & Whalen, 2003). However, in order to develop the know-how, ideas and insights of the community at large, meta data must be managed at every stage of the asset's life. Since passive utility is the discovery and knowledge-based reuse of meta data information, it stands to reason that passive utility must be delivered first; active utility without information is simply pointless. When you review Figure 1, the importance of getting the meta data right should become apparent.
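The passive-utility chain of publish, index, search and result generation can be sketched in a few lines. The asset records below are hypothetical examples, not from any real repository:

```python
from collections import defaultdict

# Publish: hypothetical asset meta data records.
ASSETS = [
    {"name": "CUST_DB",   "description": "customer billing database"},
    {"name": "ORDER_XSD", "description": "XML schema for order exchange"},
    {"name": "BILL_RPT",  "description": "monthly billing report model"},
]

def build_index(assets):
    """Index: inverted index mapping each description word to asset names."""
    index = defaultdict(set)
    for asset in assets:
        for word in asset["description"].lower().split():
            index[word].add(asset["name"])
    return index

def search(index, term):
    """Search and result generation: the assets whose meta data
    mentions the given term."""
    return sorted(index.get(term.lower(), set()))

index = build_index(ASSETS)
print(search(index, "billing"))  # ['BILL_RPT', 'CUST_DB']
```

Nothing here transforms the assets themselves; the value is purely in discovery, which is what makes it passive utility.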

Getting it right means capturing accurate, complete and contextual information from an asset and then providing access to this information across the organization.

Is active utility a bad thing? On the contrary, active utility is like shooting your age in golf or winning the tournament. The hours and hours of hitting practice balls are the passive data collection activity; the payoff and the glory come from the active utility. Our only point is that you can't have the latter without the former. Active utility simply takes the meta data information and creates a new value proposition for the business or technical community. Some examples of active utility include:

  • Impact analysis across the asset population
  • Cross-reference and implied/derived meaning
  • Dynamic data exchange (XML)
  • Real-time metrics
  • Web services and the utilization of meta data-driven architectures
  • Dynamic reuse of asset information (i.e., screen/report field lookup)
  • XML file validation using DTD and schemas

In fact, active utilization may create a new asset in the form of new functionality. For example, providing the ability to cross-search an asset collection is analogous to bundling products that deliver new utility to the customer; hence the return arrow from the active utility process back to the asset inventory (see Figure 1). Therefore, a meta data services group can not only catalog technical assets, it can also create them.

The final area of Figure 1 is the information decay arrow. Information that stays within the repository will decay; the accuracy of the data is only 100 percent valid for a period of time. Why? The most obvious reason is that the technological community is constantly changing; even the low-level data constructs change. Suppose we take a snapshot of the logical, physical and operating system view of a database. How long will this snapshot be accurate? Perhaps a better question is: how long before the next DBA modifies the data structure or the modeler updates the text on a field? What we do know is that the longer information sits in a repository, the greater the chance that it is not only inaccurate but could lead to erroneous decisions from the end-user perspective. A content aging strategy should be a part of every meta data implementation. Content aging simply shows the administrator which information hasn't been updated in the past 30 days, or whatever time period is appropriate to the business. Contacts can then be made to determine whether the information should be removed or updated.

What a great research question: what is the rate of decay for information? Think about the information collected on you: address, credit score, medical history, etc. You, as a human being, are constantly changing and, therefore, the information about you is constantly changing. Your tastes, goals and plans change as you move into different stages of life. So we know information decays, but at what rate? Perhaps there is a data half-life out there waiting to be discovered. While we are not sure what the rate of information decay is, we can slow it down by increasing the usage of meta data information in both passive and active frameworks. In real estate, the three most important words are location, location, location. In meta data, it's quality, quality, quality. In that order!
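If such a data half-life exists, the natural model is exponential decay. This is purely speculative arithmetic in the spirit of the question above; the 90-day half-life is an assumed figure, not a measured one:

```python
def confidence(days_since_update, half_life_days):
    """Hypothetical decay model: confidence in a meta data entry
    halves every half_life_days since its last verification."""
    return 0.5 ** (days_since_update / half_life_days)

# With an assumed 90-day half-life, a snapshot sits at half
# confidence after 90 days and a quarter after 180.
print(confidence(90, 90), confidence(180, 90))
```

Measuring the real curve, by auditing how often repository entries turn out to be wrong at various ages, would be the research project the column proposes.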

It takes a great deal of both individual and collective energy to have a successful year in organizations like yours and mine, especially in the world of data architecture. As the year comes to a close, I want to thank all of you for your support for the concepts of meta data, data resource management, architecture and the like. We're looking forward to another year of excitement, change and opportunities. It is the body of knowledge that we are expanding and perhaps there is no greater calling within the technical community. Thank you for your help and best wishes for a happy, prosperous and healthy new year.



R. Todd Stephens, Ph.D. is the director of the Meta Data Services Group for the BellSouth Corporation, located in Atlanta, Georgia. He has more than 20 years of experience in information technology and speaks around the world on meta data, data architecture and information technology. Stephens recently earned his Ph.D. in information systems and has more than 70 publications in the academic, professional and patent arenas. You can reach him via e-mail at Todd@rtodd.com or learn more at http://www.rtodd.com/.

