DM Review | Covering Business Intelligence, Integration & Analytics
Meta Data Repositories: Where We've Been and Where We're Going

  Article published in DM Review Magazine
February 2002 Issue
  By David Marco

Many people believe that meta data and meta data repositories are new concepts; however, their origins date back to the early 1970s, or the first days of computing. When we first started building computer systems, we realized that there was a "bunch of stuff" (knowledge) that was absolutely necessary for building, using and maintaining information technology (IT) systems. We learned very quickly that meta data existed throughout all of our organizations (see Figure 1): it is stored in our systems, technical processes, business processes, policies and people. Yet we had no place to put any of this information (meta data). At this point, we realized that we needed data about the data that we were using in our computer systems.

Figure 1: Meta Data Points

Early Commercial Products

When the first commercial meta data repositories appeared in the mid-1970s, they were called "data dictionaries." These data dictionaries were very data-focused and less knowledge-focused. They provided a centralized repository of information about data such as meaning, relationships, origin, domain, usage and format. Their purpose was to assist database administrators (DBAs) in planning, controlling and evaluating the collection, storage and use of data. One of the challenges that meta data repositories have today is differentiating themselves from data dictionaries. While meta data repositories perform all of the functions of a data dictionary, their scope is far greater. The early meta data repositories (data dictionaries) were mainly used for defining requirements, corporate data modeling, COBOL (common business-oriented language) and PL/1 (programming language one) data definition generation and database support (see Figure 2).

Figure 2: 1970s -- Data Dictionaries Masquerading as Repositories
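
To make the idea of a data dictionary concrete, the following is a minimal sketch in Python of the kind of record such a tool might hold for one data element. The fields mirror the attributes listed above (meaning, relationships, origin, domain, usage and format). The class names, the element names and the program names are hypothetical and chosen only for illustration; they do not describe any actual product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DictionaryEntry:
    """One data element described by the attributes a data dictionary tracked."""
    name: str
    meaning: str        # business definition of the element
    origin: str         # system or file where the element is created
    domain: str         # permitted values
    data_format: str    # physical format, e.g. a COBOL picture clause
    usage: List[str] = field(default_factory=list)          # programs that use it
    relationships: List[str] = field(default_factory=list)  # related elements


class DataDictionary:
    """A centralized, in-memory store of entries keyed by element name."""

    def __init__(self) -> None:
        self._entries: Dict[str, DictionaryEntry] = {}

    def register(self, entry: DictionaryEntry) -> None:
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> DictionaryEntry:
        return self._entries[name]

    def used_by(self, program: str) -> List[str]:
        """Return the sorted names of all elements that a program uses."""
        return sorted(
            entry.name
            for entry in self._entries.values()
            if program in entry.usage
        )


# Hypothetical sample entries (all names are invented for this sketch).
dictionary = DataDictionary()
dictionary.register(DictionaryEntry(
    name="CUSTOMER_ID",
    meaning="Unique identifier assigned to a customer",
    origin="ORDER-ENTRY system",
    domain="Positive integers",
    data_format="PIC 9(8)",
    usage=["BILLING01", "SHIPRPT"],
    relationships=["ORDER_ID"],
))
dictionary.register(DictionaryEntry(
    name="ORDER_ID",
    meaning="Unique identifier assigned to an order",
    origin="ORDER-ENTRY system",
    domain="Positive integers",
    data_format="PIC 9(10)",
    usage=["BILLING01"],
    relationships=["CUSTOMER_ID"],
))

print(dictionary.lookup("CUSTOMER_ID").data_format)
print(dictionary.used_by("BILLING01"))
```

A lookup such as `used_by` hints at why DBAs valued these tools: it answers which elements a given program touches. Everything in the sketch is technical meta data about data; it records nothing about business processes, policies or people, which is where a meta data repository's scope extends beyond a data dictionary.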

Later, a new phenomenon entered the world of IT and forever changed it: the personal computer (PC). When PCs burst onto the business scene, they changed the way companies worked and fueled tremendous gains in productivity. CASE (computer-aided software engineering) was one of the productivity gains. CASE tools are software applications that automate the process of designing databases, applications and software implementation. These design and construction tools stored data about the data (meta data) that they managed (see Figure 3).

Figure 3: 1980s -- CASE Tool-Based Repositories

It didn't take long before the users of CASE tools started asking their vendors to build interfaces to link the meta data from various CASE tools together. The CASE tool vendors were reluctant to build these interfaces because they believed that their tools' repositories could provide all of the necessary functionality; and, understandably, they didn't want companies to be able to easily migrate from their tool to a competitor's tool. Nevertheless, some interfaces were built either using vendor tools or dedicated interface tools.

In 1987, the need for CASE-tool integration triggered the Electronic Industries Alliance (EIA) to begin working on a CASE data interchange format (CDIF). CDIF attempted to tackle the problem by defining meta models for specific CASE tool subject areas by means of an object-oriented entity relationship modeling technique. In many ways, the CDIF standards came too late for the CASE tool industry.

In the 1980s, several companies including IBM announced mainframe-based meta data repository tools. These efforts were the first meta data initiatives; however, their scope was limited to technical meta data and almost completely ignored business meta data. Most of these early meta data repositories were just glamorized data dictionaries, intended, like the earlier data dictionaries, for use by DBAs and data modelers. In addition, the companies that created these repositories did little to educate their clients in the use of these tools. Few companies saw much value in these early repository applications.

In the 1990s, decision support emerged and soon convinced business managers of the value of a meta data repository, expanding the scope of the early repository efforts well beyond that of data dictionaries.

Figure 4: 1990s -- Decision Support Meta Data Repositories

The meta data repositories of the 1990s featured a client/server paradigm as opposed to the traditional mainframe platform. The mainframe vendors viewed these new repositories as a threat because they greatly eased the task of migrating from a mainframe environment to the new and popular client/server architecture. The multiplicity of decision support tools requiring access to meta data reawakened the slumbering repository market. Vendors such as Rochade, RELTECH Group and BrownStone Solutions were quick to jump into the fray with new repository products. Many older, established computing companies recognized the market potential and attempted, sometimes successfully, to buy their way in by acquiring these pioneer repository vendors. For example, Platinum Technology purchased RELTECH, BrownStone and LogicWorks, and was then swallowed by Computer Associates in 1999.

Where Are We Headed?

Currently, meta data management and meta data repository development are in a stage very similar to that of data warehousing in the early 1990s. At that time, people such as Bill Inmon were articulating the value of building data warehouses, and companies were beginning to listen and starting to invest in data warehousing. Meta data repositories are moving in much the same direction today. In fact, at Enterprise Warehousing Solutions (EWS), we are doing more meta data repository development now than at any other point in our company's history. Companies are beginning to realize that they need to make significant investments in their repositories in order for their systems to provide value.

All corporations are becoming more intelligent. Businesses realize that to attain a competitive advantage, they need their IT systems to manage more than just their data; they must manage their knowledge (meta data). As a corporation's IT systems mature, they progress from collecting and managing data to collecting and managing knowledge. Knowledge is a company's most valuable asset, and a meta data repository is the key to managing a company's corporate knowledge (for more information on this topic see my column, "A Meta Data Repository Is The Key To Knowledge Management," DM Review, December 2000).

Maturing Products

There has been no tougher critic of the meta data integration vendors than myself, and I still believe that these vendors are neglecting their most important user: the business user. With that said, in the past year I also have seen across-the-board improvements by almost all of the vendors in this area. New vendors such as Data Advantage Group are entering the meta data integration scene with new and exciting products. In addition, the more traditional meta data repository vendors such as Computer Associates and Allen Systems Group have all dramatically improved their product lines.

Figure 5: Meta Data Integration Vendors

Approximately nine months ago, I was asked to speak to a group of about 15 IT senior vice presidents of banks. Their number-one technology issue was meta data! When I spoke on meta data many years ago, we were lucky to have 15 IT developers attend a talk. Most Fortune 500 companies have massive amounts of redundant data (in my experience, the average company has fourfold unnecessary data redundancy), needlessly redundant systems and tremendous data quality problems. Fortunately, executive management is starting to realize that these problems result in a tremendous cost drain for their companies. These same people are looking to control the costs of their IT departments through the use of meta data repositories. As a result, meta data repositories and meta data management are continuing to move up corporations' IT priority lists.



David Marco is an internationally recognized expert in the fields of enterprise architecture, data warehousing and business intelligence and is the world's foremost authority on meta data. He is the author of Universal Meta Data Models (Wiley, 2004) and Building and Managing the Meta Data Repository: A Full Life-Cycle Guide (Wiley, 2000). Marco has taught at the University of Chicago and DePaul University, and in 2004 he was selected to the prestigious Crain's Chicago Business "Top 40 Under 40." He is the founder and president of Enterprise Warehousing Solutions, Inc., a GSA schedule and Chicago-headquartered strategic partner and systems integrator dedicated to providing companies and large government agencies with best-in-class business intelligence solutions using data warehousing and meta data repository technologies. He may be reached at (866) EWS-1100 or via e-mail at DMarco@EWSolutions.com.

SourceMedia (c) 2006 DM Review and SourceMedia, Inc. All rights reserved.
SourceMedia is an Investcorp company.
Use, duplication, or sale of this service, or data contained herein, is strictly prohibited.