DM Review | Covering Business Intelligence, Integration & Analytics
Knowledge: The Essence of Meta Data:
The Five Disciplines of Data

By R. Todd Stephens, Ph.D., online columnist
Column published in DMReview.com, February 19, 2004

Several months ago, I was addressing a group at a data conference in Philadelphia. At the end of the session, a gentleman approached me and said, "You should change the name of your group; you guys are doing much more than meta data." Change the name of our group? Change the name of the Meta Data Services Group (MSG)? MSG is our brand, MSG is our passion, MSG is what we are all about. That being said, he had a point that I contemplated all the way back to Atlanta. I realized that the walls of meta data are very short indeed. Meta data spills over into a wide variety of subjects, concepts and disciplines. This neighborhood of philosophies comes together to form the five disciplines of data. Figure 1 provides a very high-level view of these disciplines. Keep in mind that I am looking at these subject areas through my rose-colored meta-glasses; you will notice the study of structure is in the point position.


Figure 1: Five Disciplines of Data

The five disciplines of data include:

  1. Meta Data Architecture: The Study of Structure
  2. Data Architecture: The Study of Data Resource Management
  3. Information Architecture: The Study of Human/Computer Interaction
  4. Knowledge Management: The Study of Un-Structure
  5. Content Management: The Study of the Lifecycle of Data

Meta Data Architecture: The Study of Structure

The study of structure is a rather broad definition that focuses on the basic construct of information. Enterprise meta data has received a lot of attention and press over the past few years, as many organizations have attempted to push data warehouse success to the enterprise level. Unfortunately, the integrated tools, controlled environment and high degree of quality assurance are much harder to find at the enterprise level. Enterprise data architects are looking for alternative methods of data integration. Perhaps enterprise meta data holds the key. Historically, definitions of meta data have described structural aspects of the mechanisms that house data, such as the table, column and attribute characteristics of a relational database management system (RDBMS). At times, the concept of meta data has also included descriptions of the state of data, along with enumerations and summaries of various views of data.
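To make the traditional, structural sense of meta data concrete, here is a minimal Python sketch that reads table and column characteristics from a relational database. It assumes SQLite (through the standard `sqlite3` module) purely for convenience; the `customer` table and the `describe_table` helper are hypothetical names chosen for illustration, not part of any product mentioned in this column.

```python
import sqlite3


def describe_table(conn, table):
    """Return structural meta data for each column of a table.

    Assumption: `table` is a trusted identifier, since it is
    interpolated directly into the PRAGMA statement.
    """
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row holds: cid, name, type, notnull, dflt_value, pk
    return [
        {
            "column": name,
            "type": col_type,
            "nullable": not notnull,
            "primary_key": bool(pk),
        }
        for _cid, name, col_type, notnull, _default, pk in rows
    ]


# Hypothetical example table, created in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, region TEXT)"
)

columns = describe_table(conn, "customer")
for col in columns:
    print(col)
```

The output is data that describes data: nothing in it is a customer record, yet it tells you what a customer record must look like. This is only the narrow, historical view; the broader definition that follows goes well beyond it.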

Today, with the advent of technologies such as hypermedia and heuristically based searching and indexing, a new, broader, more generic definition of meta data is needed. This definition should include the traditional concepts, but it should add the concepts of existence, perspective, modeling and topicality. A new definition should recognize that much, if not most, of enterprise data is not found in traditional RDBMSs; rather, it is found in the myriad technological assets, and the views of those assets, that exist at any point in time. My definition of meta data is as follows:

"Meta data is structured, semi-structured and unstructured data which describes the characteristics of a resource (external source) or asset (internal source). Meta data is about knowledge, which is the ability to turn information and data into effective action."

Now, any definition without context is not of much use, especially one that covers a wide spectrum of concepts. However, the basic belief is that structure is at the heart of our ability to understand complex thoughts, including those of information.

Data Architecture: The Study of Data Resource Management

Data is one of the top assets of a corporation. Many decision-makers require only "data and fact" to solve problems a corporation faces. However, these decision-makers cannot easily make decisions when there are inconsistencies in data. Therefore, the goal of data architecture is to provide the framework where data quality is improved by managing data as an asset.

Data architecture is the blueprint for the lifecycle of business information. The lifecycle includes the structure of the information (as represented by data models and databases) as well as the business events and movement of the information. This view of data provides a better foundation for transforming a silo-oriented set of existing applications into systems that fit into the organizational model.

Data architecture is based on business goals and objectives, technical goals and objectives, and the desired state definitions of the other technical (e.g., platform) and functional architectures (i.e., applications). It is the corporation's expression of strategy for creating and managing the use of data in order to transform data into information. Recognizing that data is a strategic asset that is expensive to handle and easy to waste, the data architecture must assure:

  • Standardization of data structures (logical and physical).
  • Definition and protection of the data resource.
  • Consistency and quality of the data resource.
  • Judicious use of corporate resources (e.g., personnel) in managing the data asset.
  • That credible and timely data is delivered throughout the enterprise at a reasonable cost.
  • That it can be driven by the needs of business and can adapt to changes in the business.
  • That it can be implemented in any technical environment.

The data architecture focuses on data resource management, which should include data quality, data content, data usage, data access, data storage management and data modeling. As you can see from the first two areas, these dimensions overlap each other greatly. Don't worry; this blurriness gets worse as we bring in the next three areas.

Information Architecture: The Study of Human/Computer Interaction

With my deepest apologies to the human-computer interaction (HCI) industry, I am going to simplify the entire field of study into the concepts of information architecture. The Asilomar Institute for Information Architecture defines information architecture as the structural design of shared information environments, which includes the art and science of organizing and labeling Web sites, intranets, online communities and software to support usability and findability. Information architecture is an emerging community of practice focused on bringing principles of design and architecture to the digital landscape. The information architecture project at MIT seeks to create information spaces in which people will search, browse and learn, navigating through knowledge in the same way that they navigate in the physical environment. Take a look at my article published in The Data Administration Newsletter (TDAN.com), which discusses, in exhaustive detail, the concepts around usability and their importance.

Knowledge Management: The Study of Un-Structure

Knowledge management adds the practice of capturing the tacit experience of the individual so that it can be shared, used and built upon by the organization, leading to increased productivity. The study of un-structure provides us with the means of capturing, storing and disseminating knowledge throughout the organization.

Knowledge management starts with the individual and moves through an organization. Every individual uses knowledge management tools - including personal memory, date books, notebooks, file cabinets, e-mail archives, calendars, post-it notes, bulletin boards, newsletters, journals and restaurant napkins. Knowledge management begins when an organization enables individuals to link their personal knowledge management systems with organizational knowledge management systems (Richardson, 2001). Knowledge is about the "how," and much of this information is stored in various objects as described above.

Content Management: The Study of the Lifecycle of Data

Content management is simply the lifecycle of data, information and knowledge. Content management can mean different things to different people. Fundamentally, it is the process of acquiring content, working with it, publishing it and retiring it. In its most basic form, a content management system should allow each content producer to create assets and feed them to the publishing system. The system should have customized and automated checks and balances to ensure that information gets placed correctly, that navigation trees are created and maintained, and that the appropriate people control the process along the way. To make this happen, good content management packages separate content (written material, images, streaming audio and anything else that makes up an asset) from presentation of content, and they include strong workflow capabilities (Wolf, 2003).
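The lifecycle and the separation of content from presentation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular content management package works: the stage names, the allowed transitions, and the `Asset` and `render_html` names are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from html import escape


class Stage(Enum):
    ACQUIRED = "acquired"
    IN_REVIEW = "in_review"
    PUBLISHED = "published"
    RETIRED = "retired"


# Assumed workflow rules: which stage may follow which.
ALLOWED = {
    Stage.ACQUIRED: {Stage.IN_REVIEW},
    Stage.IN_REVIEW: {Stage.ACQUIRED, Stage.PUBLISHED},
    Stage.PUBLISHED: {Stage.IN_REVIEW, Stage.RETIRED},
    Stage.RETIRED: set(),
}


@dataclass
class Asset:
    """Content only: no layout or markup is stored here."""
    title: str
    body: str
    stage: Stage = Stage.ACQUIRED
    history: list = field(default_factory=list)

    def move_to(self, new_stage, approver):
        """Advance the asset, recording who controlled the step."""
        if new_stage not in ALLOWED[self.stage]:
            raise ValueError(f"{self.stage.value} -> {new_stage.value} is not allowed")
        self.history.append((self.stage, new_stage, approver))
        self.stage = new_stage


def render_html(asset):
    """Presentation lives apart from the content it displays."""
    return f"<h1>{escape(asset.title)}</h1><p>{escape(asset.body)}</p>"


asset = Asset("Q1 report", "Revenue grew.")
asset.move_to(Stage.IN_REVIEW, approver="editor")
asset.move_to(Stage.PUBLISHED, approver="publisher")
print(render_html(asset))
```

Because the asset carries no markup, the same content could be fed to a different renderer without touching the workflow, and the history list gives the "appropriate people" an audit trail of each step.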

Convergence of Disciplines

When I showed Figure 1 to a couple of other knowledge management architects, they responded with the expected groans and body language that shouted disapproval. Then, I walked through an example of how we might build a repository and the impact of each area. Clearly, a repository will contain the general structured information that describes the asset (e.g., Dublin Core), the object-specific structured information (e.g., OMG) and the myriad relationships that surround the asset (meta data architecture). This information will be brought in, categorized, utilized and then retired as new and updated information is loaded (content management). We will add supporting documents, scan schedules and user guides as a collection of unstructured information sources (knowledge management). The underlying meta-model will be modeled using our entity relationship tool, and the data will be sorted into specific categories as defined by the repository librarian (data architecture). Finally, the repository itself will be designed based on a solid set of human-computer interaction principles (information architecture). As meta data professionals, we cannot ignore the other disciplines; more importantly, we should embrace their utility and value.
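The repository walk-through above can be sketched as a single record type. The following Python is a rough illustration only: the `RepositoryEntry` class, its field names and the sample values are hypothetical, and the only outside fact relied on is the list of fifteen element names from the Dublin Core Metadata Element Set.

```python
from dataclasses import dataclass, field

# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


@dataclass
class RepositoryEntry:
    general: dict    # general structured description (Dublin Core elements)
    specific: dict   # object-specific structured description
    category: str    # assigned by the repository librarian
    relationships: list = field(default_factory=list)  # (predicate, target id) pairs
    documents: list = field(default_factory=list)      # unstructured supporting material

    def __post_init__(self):
        # Reject general descriptors that are not Dublin Core elements.
        unknown = set(self.general) - DUBLIN_CORE_ELEMENTS
        if unknown:
            raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")


# Hypothetical sample entry.
entry = RepositoryEntry(
    general={"title": "Customer data mart", "creator": "MSG", "identifier": "asset-001"},
    specific={"table_count": 12},
    category="data warehouse",
    relationships=[("feeds", "asset-002")],
    documents=["user_guide.pdf", "scan_schedule.txt"],
)
print(entry.general["title"], entry.relationships)
```

Each field maps to one of the disciplines in the paragraph above: `general`, `specific` and `relationships` to meta data architecture, `documents` to knowledge management, and `category` to data architecture. Lifecycle handling and interface design are left out of this sketch.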

By the end of the conversation, the only thing they wanted to change was to put knowledge management, rather than meta data architecture, at the center of the disciplines of data. Sorry, I only have one pair of rose-colored meta-glasses.

Do you still see meta data as "data about data"? Can you imagine meta data as one of the core disciplines of data? Better yet, do you see why the study of structure is at the center of the model? The reality is that without structure, the other four are meaningless!


R. Todd Stephens, Ph.D. is the director of Meta Data Services Group for the BellSouth Corporation, located in Atlanta, Georgia. He has more than 20 years of experience in information technology and speaks around the world on meta data, data architecture and information technology. Stephens recently earned his Ph.D. in information systems and has more than 70 publications in the academic, professional and patent arena. You can reach him via e-mail at Todd@rtodd.com or to learn more visit http://www.rtodd.com/.
