Knowledge: The Essence of Meta Data:
Several months ago, I was addressing a group at a data conference in Philadelphia. At the end of the session, a gentleman approached me and said, "You should change the name of your group; you guys are doing much more than meta data." Change the name of our group? Change the name of the Meta Data Services Group (MSG)? MSG is our brand, MSG is our passion, MSG is what we are all about. That being said, he had a point that I contemplated all the way back to Atlanta. I realized that the walls of meta data are very short indeed. Meta data spills over into a wide variety of subjects, concepts and disciplines. This neighborhood of philosophies comes together to form the five disciplines of data. Figure 1 provides a very high-level view of these disciplines. Keep in mind that I am looking at these subject areas through my rose-colored meta-glasses; you will notice the study of structure is in the point position.
Figure 1: Five Disciplines of Data
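The five disciplines of data are meta data architecture (the study of structure), data architecture, information architecture (the study of human/computer interaction), knowledge management (the study of un-structure) and content management (the study of the lifecycle of data).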
The study of structure is a rather broad definition that focuses on the basic construct of information. Enterprise meta data has received a lot of attention and press over the past few years, as many organizations have attempted to push data warehouse success to the enterprise level. Unfortunately, the integrated tools, controlled environment and high degree of quality assurance found in the data warehouse are much harder to find at the enterprise level. Enterprise data architects are looking for alternative methods of data integration. Perhaps enterprise meta data holds the key. Historically, definitions of meta data have described structural aspects of the mechanisms that house data, such as the table, column and attribute characteristics of a relational database management system (RDBMS). At times, meta data as a concept has also included descriptions of the state of data, along with enumerations and summaries of various views of data.
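To make the traditional, structural sense of meta data concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database and a hypothetical `customer` table; the helper name `describe_table` is my own invention for illustration, and the table name is interpolated directly, so it should only ever be a trusted value.

```python
import sqlite3


def describe_table(conn, table):
    """Return structural meta data for one table: name, declared type,
    nullability and primary-key flag for every column."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [
        {
            "column": name,
            "type": col_type,
            "nullable": not notnull,
            "primary_key": bool(pk),
        }
        for _cid, name, col_type, notnull, _default, pk in rows
    ]


# Hypothetical table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, region TEXT)"
)
customer_meta = describe_table(conn, "customer")
for column in customer_meta:
    print(column)
```

Nothing in this output is customer data. It is purely a description of the structure that houses the data, which is the narrow sense of meta data that the broader definition has to extend.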
Today, with the advent of technologies such as hypermedia and heuristically based searching and indexing, a new, broader, more generic definition of meta data is needed. This definition should include the traditional concepts, but it should add the concepts of existence, perspective, modeling and topicality. A new definition should recognize that much, if not most, of enterprise data is not found in traditional RDBMSs; rather, it is found in the myriad technological assets, and views of those assets, that exist at any point in time. My definition of meta data is as follows:
"Meta data is structured, semi-structured and unstructured data which describes the characteristics of a resource (external source) or asset (internal source). Meta data is about knowledge, which is the ability to turn information and data into effective action."
Now, any definition without context is not of much use, especially one that covers a wide spectrum of concepts. However, the basic belief is that structure is at the heart of our ability to understand complex thoughts, including those of information.
Data is one of the top assets of a corporation. Many decision-makers require only "data and fact" to solve problems a corporation faces. However, these decision-makers cannot easily make decisions when there are inconsistencies in data. Therefore, the goal of data architecture is to provide the framework where data quality is improved by managing data as an asset.
Data architecture is the blueprint for the lifecycle of business information. The lifecycle includes the structure of the information (as represented by data models and databases) as well as the business events and movement of the information. This view of data provides a better foundation for transforming a silo-oriented set of existing applications into systems that fit into the organizational model.
Data architecture is based on business goals and objectives, technical goals and objectives, and the desired state definitions of the other technical (e.g., platform) and functional architectures (i.e., applications). It is the corporation's expression of strategy for creating and managing the use of data in order to transform data into information. Recognizing that data is a strategic asset that is expensive to handle and easy to waste, the data architecture must assure:
Data architecture focuses on the data resource management of information, which should include data quality, data content, data usage, data access, data storage management and data modeling. As you can see from the first two areas, these dimensions overlap each other greatly. Don't worry; this blurriness only gets worse as we bring in the next three areas.
Information Architecture: The Study of Human/Computer Interaction
With my deepest apologies to the human-computer interaction (HCI) industry, I am going to simplify the entire field of study into the concepts of information architecture. The Asilomar Institute for Information Architecture defines information architecture as the structural design of shared information environments, which includes the art and science of organizing and labeling Web sites, intranets, online communities and software to support usability and findability. Information architecture is an emerging community of practice focused on bringing principles of design and architecture to the digital landscape. The information architecture project at MIT seeks to create information spaces in which people search, browse and learn, navigating through knowledge in the same way that they navigate the physical environment. Take a look at my article published in The Data Administration Newsletter (TDAN.com), which discusses, in exhaustive detail, the concepts around usability and their importance.
Knowledge Management: The Study of Un-Structure
Knowledge management adds the practice of capturing the tacit experience of the individual so that it can be shared, used and built upon by the organization, leading to increased productivity. The study of un-structure provides us with the means of capturing, storing and disseminating knowledge throughout the organization.
Knowledge management starts with the individual and moves through an organization. Every individual uses knowledge management tools, including personal memory, date books, notebooks, file cabinets, e-mail archives, calendars, post-it notes, bulletin boards, newsletters, journals and restaurant napkins. Knowledge management begins when an organization enables individuals to link their personal knowledge management systems with organizational knowledge management systems (Richardson, 2001). Knowledge is about the "how," and much of this information is stored in various objects as described above.
Content Management: The Study of the Lifecycle of Data
Content management is simply the lifecycle of data, information and knowledge. Content management can mean different things to different people. Fundamentally, it is the process of acquiring content, working with it, publishing it and retiring it. In its most basic form, a content management system should allow each content producer to create assets and feed them to the publishing system. The system should have customized and automated checks and balances to ensure that information gets placed correctly, that navigation trees are created and maintained, and that the appropriate people control the process along the way. To make this happen, good content management packages separate content (written material, images, streaming audio and anything else that makes up an asset) from the presentation of content, and they include strong workflow capabilities (Wolf, 2003).
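The lifecycle and the separation of content from presentation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular content management package: the stage names, the `ContentAsset` class and the two render functions are all hypothetical.

```python
from dataclasses import dataclass, field

# Allowed lifecycle transitions. The stage names are illustrative assumptions.
TRANSITIONS = {
    "acquired": {"in_review"},
    "in_review": {"acquired", "published"},  # a reviewer may send content back
    "published": {"retired"},
    "retired": set(),
}


@dataclass
class ContentAsset:
    title: str
    body: str  # the content itself, with no presentation markup
    stage: str = "acquired"
    history: list = field(default_factory=list)

    def advance(self, new_stage, approved_by):
        """Move to a new stage if the workflow allows it, recording who approved."""
        if new_stage not in TRANSITIONS[self.stage]:
            raise ValueError(f"cannot move from {self.stage} to {new_stage}")
        self.history.append((self.stage, new_stage, approved_by))
        self.stage = new_stage


# Presentation lives outside the asset, so one asset can be published many ways.
def render_html(asset):
    return f"<h1>{asset.title}</h1><p>{asset.body}</p>"


def render_text(asset):
    return f"{asset.title}\n\n{asset.body}"


article = ContentAsset("Five Disciplines of Data", "Structure sits in the point position.")
article.advance("in_review", approved_by="author")
article.advance("published", approved_by="editor")
print(render_text(article))
```

The transition table plays the role of the checks and balances: an asset cannot be published without passing review, and every move records who approved it.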
When I showed Figure 1 to a couple of other knowledge management architects, they responded with the expected groans and body language that shouted disapproval. Then, I walked through an example of how we might build a repository and the impact of each area. Clearly, a repository will contain the general structured information that describes the asset (e.g., Dublin Core), the object-specific structured information (e.g., OMG) and the myriad of relationships that surround the asset (meta data architecture). This information will be brought in, categorized, utilized and then retired as new and updated information is loaded (content management). We will add supporting documents, scan schedules and user guides as a collection of unstructured information sources (knowledge management). The underlying meta-model will be modeled using our entity relationship tool, and the data will be sorted into specific categories as defined by the repository librarian (data architecture). Finally, the repository itself will be designed based on a solid set of human-computer interaction principles (information architecture). As meta data professionals, we cannot ignore the other disciplines; more importantly, we should embrace their utility and value.
By the end of the conversation, the only thing they wanted to change was to swap meta data architecture and knowledge management, putting knowledge management at the center of the disciplines of data. Sorry, I only have one pair of rose-colored meta-glasses.
Do you still see meta data as "data about data"? Can you imagine meta data as one of the core disciplines of data? Better yet, do you see why the study of structure is at the center of the model? The reality is that without structure the other four are meaningless!
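The repository walk-through can be sketched as a small Python model. Everything here is a hypothetical illustration rather than the design of any real repository: the class names, the `object_specific` fields and the sample identifiers are assumptions, and only the keys `title`, `creator`, `date`, `type` and `identifier` are taken from the Dublin Core element set.

```python
from dataclasses import dataclass, field


@dataclass
class RepositoryAsset:
    dublin_core: dict      # general structured description of the asset
    object_specific: dict  # structure that only applies to this kind of object
    relationships: list = field(default_factory=list)  # (relation, target identifier)
    documents: list = field(default_factory=list)      # unstructured supporting material
    category: str = "uncategorized"                    # assigned by the repository librarian
    retired: bool = False


class Repository:
    def __init__(self):
        self.current = {}  # identifier -> active version of the asset
        self.archive = []  # versions retired when an update was loaded

    def load(self, asset):
        """Bring an asset in, retiring any earlier version with the same identifier."""
        key = asset.dublin_core["identifier"]
        previous = self.current.get(key)
        if previous is not None:
            previous.retired = True
            self.archive.append(previous)
        self.current[key] = asset

    def related_to(self, identifier):
        """Identifiers of active assets that hold a relationship to the given asset."""
        return [
            key
            for key, asset in self.current.items()
            if any(target == identifier for _relation, target in asset.relationships)
        ]


repo = Repository()
repo.load(RepositoryAsset(
    dublin_core={"identifier": "db:customer", "title": "Customer database",
                 "creator": "Data Architecture", "date": "2004-01-15", "type": "Dataset"},
    object_specific={"platform": "RDBMS", "tables": 42},
    documents=["user guide", "scan schedule"],
    category="databases",
))
repo.load(RepositoryAsset(
    dublin_core={"identifier": "app:billing", "title": "Billing application",
                 "creator": "Billing Team", "date": "2004-02-01", "type": "Software"},
    object_specific={"language": "Java"},
    relationships=[("reads", "db:customer")],
    category="applications",
))
print(repo.related_to("db:customer"))
```

Each field maps onto one of the disciplines in the walk-through: the descriptive and relationship fields to meta data architecture, `load` and `retired` to content management, `documents` to knowledge management and `category` to data architecture.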
R. Todd Stephens, Ph.D., is the director of the Meta Data Services Group for the BellSouth Corporation, located in Atlanta, Georgia. He has more than 20 years of experience in information technology and speaks around the world on meta data, data architecture and information technology. Stephens recently earned his Ph.D. in information systems and has more than 70 publications in the academic, professional and patent arenas. You can reach him via e-mail at Todd@rtodd.com or, to learn more, visit http://www.rtodd.com/.