Published in DM Review Online in April 2004.
Knowledge: The Essence of Meta Data: Meta Data: A Tradition Like No Other
by R. Todd Stephens
April marks the beginning of spring for Georgia residents and the whole golfing community. The pollen is back, the azaleas are in full bloom and the dogwoods have painted the hills with white blossoms. April not only brings on a transition from winter to spring but also an event that really starts the golfing season. Augusta National and the Masters are about perfection and without a doubt this year's tournament was masterful. The Masters is the only golfing event that I watch each year. There are no commercials and no electronic scoreboards; the tournament itself really begins on Sunday at Amen Corner. Similar to the data chain described this month, organizational value begins with meta data. This month I wanted to use a little golf analogy in describing some of the basics around data, information, knowledge and wisdom.
The Basics of Data
One of the most challenging aspects of being a data architect is the task of explaining the foundational definitions of data, information, knowledge and wisdom. The main issue is that everyone already knows what these terms mean. Yet, sometimes putting that into words can be a challenge. The best alternative is to use an example and explain along the way. Data can be viewed as a collection or sequence of symbols. These symbols can either be quantified or are quantifiable in nature. A quantified data element might resemble the number 84, while a quantifiable element might be an object such as an image, function or sound. These can be stored digitally as a series of zeros and ones. However, what is interesting is the fact that at this level data has very little meaning. While you might recognize the sequence, you have little or no understanding of the context, usage, validity or reliability. What does the number 84 mean as a data value? Could it be a birth year? Could it be the last two digits of a Social Security number? Could it be someone's P.O. Box number? A lumber company? The reality is that the number 84 could mean anything.
Often, we define data as the lowest possible representation of a fact or concept. Data is usually stored in a format where we can understand the value but not necessarily the meaning. That is, assuming we utilize a base construct like the alphabet or a basic numbering scheme. However, even that basic knowledge can be lost if data is looked at as a binary number string versus the base 10 number that we are all familiar with.
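The point about representation can be made concrete with a minimal sketch. It shows the value 84 as a base 10 number and as a binary string, then attaches a small descriptive record to give the raw value meaning. The `MetaData` class and its field names (`name`, `description`, `unit`) are assumptions made up for this illustration, not part of any standard or product.

```python
from dataclasses import dataclass


@dataclass
class MetaData:
    """Hypothetical descriptive record; the field names are illustrative assumptions."""
    name: str
    description: str
    unit: str


def describe(value: int, meta: MetaData) -> str:
    """Combine a raw value with its meta data to form a readable statement."""
    return f"{meta.name}: {value} {meta.unit} ({meta.description})"


raw_value = 84

# The same value in two representations; neither one says what it means.
as_base10 = str(raw_value)
as_binary = format(raw_value, "b")

# Two different contexts give the same raw value two different meanings.
golf_score = MetaData("round_score", "strokes taken over 18 holes", "strokes")
birth_year = MetaData("birth_year", "two-digit year of birth", "year")

print(as_base10, as_binary)
print(describe(raw_value, golf_score))
print(describe(raw_value, birth_year))
```

The stored value never changes; only the descriptive record tells a reader whether 84 is a golf score or a birth year.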
In a not so technical example, take the game of golf. The data elements here consist of a ball, tee, club, bag, flagstick, sand, a hole in the ground and plenty of markers. Imagine having never seen a golf tournament, never reading the rules of golf or even hearing of Jack or Arnold. Where would you start? Do you hit the ball or throw it? Do you dribble the ball or kick it? If so, which way do you play, tee to green or green to tee? Can you block the other golfers' shots or make distracting sounds while they swing the club? It sounds kind of silly to try to play the game without understanding the rules, objectives or goals. Yet, many projects within the information technology (IT) field are attempted in this haphazard manner.
Information can be described as a collection of facts or elements of data that are related in some form or fashion. Information is data with meaning and understanding. For data to be transformed to information it must be related to other data components. Patterns and relationships in the data must be discovered, assimilated, related and discussed so that the data is made informative. A technical challenge for meta data management is to infer knowledge from multiple information sources.
Returning to our not so technical example, the information on golf is contained in the USGA rules. The USGA and The Royal and Ancient Golf Club of St. Andrews, Scotland, jointly write and interpret the rules of golf to guard the tradition and integrity of the game. By and large, you can learn all of the information that describes the details of the game. Understanding the principles of the game doesn't make you a great golfer, and knowledge of the game will only come from practice. Although your score will not be anything to brag about, you will at least be able to enjoy the game and not hurt anyone. Do not worry about hitting the ball in the woods; anyone can make par from the fairway. Real golfers explore each and every hole, tee to green, sand to tree and swamps to marshes.
Knowledge is King
Knowledge is simply information placed into context. For example, if a data element has an accuracy rate of 80 percent then what should you do to the person responsible for the data? Should you fire him or give him a promotion? If the 80 percent field is the key structure of your system or application, then someone needs to answer for how that bad data got into the system and what is going to be done about it. If that accuracy rate is for the street address, then congratulations on a job well done. Databases that contain address information rarely get above a 70 percent accuracy rate.
Technology has advanced to the point where we can store petabytes of data (petabyte = quadrillion bytes). In addition, the systems can assemble data at gigahertz speeds. We can store, transfer and perform transactions in phenomenal ways, but what has been done to the process of creating, managing and storing knowledge? Much of the knowledge management area focuses on this issue. Perhaps one of the biggest problems is that creating relationships requires an enormous effort, which brings into question the validity of the results. Can you trust the results of a search engine to bring you the most valid and useful sites? Most research methodologies require multidimensional validation of the results. Should this rigor be applied to knowledge management? Of course, meta data should play an expanded role in this transformation from information to knowledge.
Knowledge is about defining "how." Defining how the golf swing should be approached or the fundamentals of sand play versus the fairway is not easy to explain. Learning the fundamentals of the game begins to have an immediate impact on the score or, in the case of business, the bottom line. Knowledge is where we can learn how to play the game. In addition, we can begin to distinguish between accuracy and reliability and the ultimate goal of obtaining both. For example, if I hit 10 shots scattered short, long, left and right, but the average distance from the hole is two feet then one can say that the shots are accurate, but not reliable. If I hit those same 10 shots 20 feet short of the pin but within two to three feet of each other, then I can say that I have obtained a reliable golf shot, just not accurate. Eventually, you will need both accuracy and reliability to be successful at the game and this is true in your meta data management implementation.
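The distinction between accuracy and reliability can be put into numbers. The sketch below is one possible interpretation, not a standard golf or data quality measure: it treats accuracy as how far the center of a group of shots lands from the pin, and reliability as how tightly the shots cluster around their own center. The shot coordinates are invented for illustration, measured in feet relative to the pin, and the function names are hypothetical.

```python
import math
from statistics import mean

# A shot is (feet right of the pin, feet past the pin); negative means left or short.
Shot = tuple[float, float]


def center(shots: list[Shot]) -> Shot:
    """Average landing point of a group of shots."""
    return (mean(x for x, _ in shots), mean(y for _, y in shots))


def accuracy_error(shots: list[Shot]) -> float:
    """Distance from the group's center to the pin (smaller = more accurate)."""
    cx, cy = center(shots)
    return math.hypot(cx, cy)


def spread(shots: list[Shot]) -> float:
    """Average distance of each shot from the group's center (smaller = more reliable)."""
    cx, cy = center(shots)
    return mean(math.hypot(x - cx, y - cy) for x, y in shots)


# Invented data: ten shots scattered short, long, left and right,
# whose center sits two feet from the pin.
scattered: list[Shot] = [
    (0.0, -18.0), (0.0, 22.0), (-20.0, 2.0), (20.0, 2.0), (15.0, 17.0),
    (-15.0, -13.0), (15.0, -13.0), (-15.0, 17.0), (2.0, 2.0), (-2.0, 2.0),
]

# Invented data: ten shots about 20 feet short, tightly grouped.
grouped: list[Shot] = [
    (0.0, -20.0), (1.0, -21.0), (-1.0, -19.0), (1.0, -19.0), (-1.0, -21.0),
    (0.0, -19.0), (0.0, -21.0), (1.0, -20.0), (-1.0, -20.0), (0.0, -20.0),
]

for label, shots in (("scattered", scattered), ("grouped", grouped)):
    print(f"{label}: accuracy error {accuracy_error(shots):.1f} ft, "
          f"spread {spread(shots):.1f} ft")
```

Under these assumptions the scattered group has a small accuracy error but a large spread, while the tight group has a small spread but a large accuracy error. The same two measures could be applied to a data field by comparing recorded values against a trusted reference.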
Wisdom: Learning from the Past
Wisdom is experience gained over a period of time which can be applied toward future decisions. Wisdom comes from various lines of knowledge where the individual spots patterns, results and trends. Wisdom aligns our judgment for the future.
Wisdom can tell the CIO that an 85 percent accuracy rate in the e-mail addresses can open up a new value chain by utilizing the online environment. Adding geographic knowledge may provide the opportunity to customize a product or service offering which will increase revenue for the corporation.
In golf, wisdom takes a very long time to develop. You are 225 yards out on a par five; the green is surrounded by bunkers, with a creek running along the fairway that cuts in front of the green. Do you pull out the three-wood or lay up with the five-iron? Knowledge says you can hit the three-wood, but wisdom says that you will make birdie 50 percent of the time laying up while going for the green only pays off 35 percent of the time. My father was right when he said, "The wisdom of golf cannot be found on the day of the tournament. If you didn't bring it with you, then you won't find it here."
Wisdom takes time to develop and a lifetime to master. The realization for those of us in meta data is that meta data is a journey that has no end. Success is staying in the game and delivering quality information that can drive unique business value.
R. Todd Stephens, Ph.D., is the director of Meta Data Services Group for the BellSouth Corporation, located in Atlanta, Georgia. He has more than 20 years of experience in information technology and speaks around the world on meta data, data architecture and information technology. Stephens recently earned his Ph.D. in information systems and has more than 70 publications in the academic, professional and patent arenas. You can reach him via e-mail at Todd@rtodd.com or, to learn more, visit http://www.rtodd.com/.
Copyright 2005, SourceMedia and DM Review.