Managing Meta Data for Corporate Savings
Meta data is an incredibly valuable yet underused IT asset. Failing to leverage the meta data sitting within computer systems and applications in a migration to Web-based services will ultimately result in wasted time and money.
Meta data is often described as data about data. However, that definition is vague and incomplete. Looking at the equation "meta data-plus-data-equals-information," meta data can more succinctly be described as data that expresses the context or relativity of data.
In practical terms, meta data describes critical factors about systems and applications, such as where a particular data source is located and the data types that are used by these systems and applications. Thorough understanding of the meta data sitting within systems is crucial to lowering software development and maintenance costs as well as unlocking a "lazy" asset: the data itself.
Meta data enables better understanding of corporate data and systems. More than providing documentation about how the system runs, it tells companies where the system is running and where the physical resources being used by the system are located. When meta data is accessible, applications become easier to maintain and (if necessary) replace. Additionally, meta data helps spot potential pitfalls and errors, such as - in the case of the Y2K challenge - finding a date field that cannot support a change in century.
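The Y2K case can be illustrated with a minimal sketch. It assumes field-level meta data (system, field name, data type and length) has already been collected into a simple catalog. The `FieldMeta` class, the catalog entries and the eight-character threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMeta:
    """Meta data describing one field (hypothetical structure)."""
    system: str
    name: str
    data_type: str
    length: int


def find_two_digit_year_fields(fields):
    """Return date fields too short to hold a four-digit year.

    Assumption: a date stored in fewer than 8 characters (e.g. YYMMDD)
    has no room for the century.
    """
    return [f for f in fields if f.data_type == "date" and f.length < 8]


# Hypothetical catalog of collected meta data.
catalog = [
    FieldMeta("billing", "invoice_date", "date", 6),
    FieldMeta("billing", "customer_name", "char", 40),
    FieldMeta("crm", "signup_date", "date", 8),
]

at_risk = find_two_digit_year_fields(catalog)
for f in at_risk:
    print(f"{f.system}.{f.name}: length {f.length}")
```

The point of the sketch is that the check runs against the meta data alone; no application data has to be read to find the fields at risk.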
Meta Data as a Key to Transformation
For a major credit card issuer, meta data recently became the cornerstone of a complete data management transformation. The company needed to significantly reduce the time required to integrate acquired credit card portfolio data into the existing infrastructure. By capturing key integration meta data - such as business vocabulary, entity relationship models, transformation models and Web services definitions - the company was able to quickly identify the mapping from the new portfolio into the existing data structures, as well as provide the ability to dynamically execute those mappings on demand. So far, the process has been reduced from 9-12 months to 2-3 months.
Extensible markup language (XML) played a key role in the design of this data management strategy. First, all the integration meta data was captured in XML vocabularies. Common warehouse metamodel (CWM) was used to capture the entity relationship and transformation models. Web services description language (WSDL) and universal description, discovery and integration (UDDI) vocabularies were used to capture Web services meta data. Finally, the business vocabulary was captured using ontology vocabularies, such as resource description framework (RDF) and Web ontology language (OWL).
It is worth noting that this meta data capture environment's use of XML standards allowed it to deliver a collaborative design environment for data warehouse, business intelligence and transformation modeling, even though each of these disciplines used a different application.
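The idea of capturing a mapping as meta data and executing it on demand can be sketched roughly as follows. This is not the issuer's actual implementation. The field names, the transforms and the `apply_mapping` helper are hypothetical and serve only to show a mapping that is data rather than hard-coded logic.

```python
# Hypothetical mapping meta data: target field -> (source field, transform).
PORTFOLIO_MAPPING = {
    "account_id": ("ACCT_NO", str.strip),
    # Assumption: the acquired portfolio stores limits in cents.
    "credit_limit": ("CR_LIM", lambda value: int(value) / 100),
    "holder_name": ("CUST_NM", str.title),
}


def apply_mapping(record, mapping):
    """Build a target record by applying each mapping entry to the source."""
    return {
        target: transform(record[source])
        for target, (source, transform) in mapping.items()
    }


# One record from a hypothetical acquired portfolio.
acquired = {"ACCT_NO": " 4401 ", "CR_LIM": "250000", "CUST_NM": "JANE DOE"}

mapped = apply_mapping(acquired, PORTFOLIO_MAPPING)
print(mapped)
```

Because the mapping is itself data, integrating another portfolio would mean supplying a new mapping table rather than writing new integration code.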
The Business Value of Meta Data
When a company gains a thorough understanding of the data it possesses, it can then intelligently decide what benefit that data provides to the organization. Moreover, when the meta data is made available to all appropriate corporate personnel, the outgrowth is often new and innovative ways to use that data.
For example, if the IT department is the only group that has access to the meta data, the integrations and reports developed will be limited by the scope of IT's understanding of the business. However, if the business manager for new account development has access to a source of well-defined meta data, then that manager can work with IT to identify how to formulate a campaign for attracting new customers. This reuse and sharing of data and meta data lowers the costs of software development, implementation and maintenance, and increases the opportunity for standardization of information across the company.
Finding Meta Data within the Organization
While it is clear that compelling business drivers exist for identifying and making meta data more widely available, a serious obstacle remains: meta data is everywhere! Therefore, knowing how to filter and limit meta data is a required skill. To illustrate the diversity of the world of meta data, here is just a short list of places where companies can extract meta data:
- Legacy mainframe systems
- Relational databases
- Hierarchical databases
- Object-oriented applications
- Logical models
- Enterprise resource planning software
- Office automation documents
Successful meta data mining requires a carefully thought-out approach that entails looking beyond the usual places listed above. The scope of the project must identify which meta data components are critical for a particular goal. If a cursory approach is used to collect meta data, the size of the data set can become overwhelming and difficult to control or use. On the other hand, if the data set is overanalyzed, important meta data may be discarded because it is viewed as data, not meta data.
One area in which this is readily seen is the advancement and deployment of radio frequency identification (RFID). RFID not only has the potential to create a large volume of new data, it also generates a need for supporting meta data. For example, RFID can be as simple as identifying a pallet of soft drink bottles in a warehouse, or it can be as complex as identifying a single bottle within that pallet along with that bottle's complete lineage from the minute it was filled. The latter requires far more meta data than the former. Companies need to balance the amount of meta data they collect so that the project doesn't yield an overwhelming or underwhelming amount of data.
Putting Meta Data to Work
A company's commitment to capturing meta data requires two additional decisions. The first is where the meta data should be stored; the second is how the meta data will be made available to those who need it.
In terms of storage, the most obvious solution is to use a meta data repository. This is a specialized data management application designed to provide the infrastructure and support for storage of interrelated components of information. Meta data repositories not only help capture information about singular meta data components, but also the relationships between individual components.
For example, most meta data repositories will capture that there is a field called "address" in a relational database stored on a UNIX server called "Jade" and that there is a field called "address" on a mainframe 3270 screen.
Leading meta data management solutions provide a way to identify that the two address fields in the example above are related: for example, that they are the same or that one becomes the other through a batch process. Knowing that the fields exist only helps gain control over a small quantity of data. It is knowing the relationships between them that helps to track and understand how data flows and is used within organizations.
In addition to providing data lineage - which defines the history of the data, including where it came from and what other pieces of data are derived from this data or used to create this data - meta data repositories also provide important functionality for searching the available meta data and producing impact analyses. Impact analyses identify all the resources that rely on a particular piece of meta data and, therefore, assist in defining all the resources that would be impacted by a change in the location or type of data associated with a meta data component. Producing these types of reports, however, requires a dedication to inputting and updating all the information in the repository.
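An impact analysis of this kind amounts to walking the relationships recorded in the repository. The following is a minimal sketch, not the interface of any particular repository product. The component names and the `EDGES` list are hypothetical, with each pair meaning that the second component is derived from the first.

```python
from collections import defaultdict, deque

# Hypothetical lineage relationships: (source, target) means the target
# is derived from the source, e.g. through a batch process.
EDGES = [
    ("mainframe.screen.address", "jade.customers.address"),
    ("jade.customers.address", "warehouse.dim_customer.address"),
    ("warehouse.dim_customer.address", "report.mailing_list"),
    ("jade.customers.name", "report.mailing_list"),
]


def build_graph(edges):
    """Index the relationships by source component."""
    downstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
    return downstream


def impact_analysis(component, downstream):
    """Return every component that directly or indirectly relies on `component`."""
    impacted = set()
    queue = deque([component])
    while queue:
        current = queue.popleft()
        for dependent in downstream.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


graph = build_graph(EDGES)
impacted = impact_analysis("jade.customers.address", graph)
print(sorted(impacted))
```

Reversing the direction of the edges and running the same traversal would give the lineage of a component instead of its impact. In both cases the result is only as complete as the relationships that have been entered.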
XML and Meta Data Management
In addition to using standard XML vocabularies to assist in the capture and sharing of meta data, XML is also a strong candidate for the design and storage of meta data.
XML is a growing favorite of enterprises as an answer to both storage of and access to meta data. XML is a tag-based language that allows users to demarcate items of data through the use of named tags called elements. Elements can contain supplemental information called attributes that assign values to uniquely named keys. XML provides a way to represent both meta data and data combined in the same document as illustrated in Figure 1.
Figure 1: XML Represents Data and Meta Data in Same Document
Figure 1 shows how elements clearly define the data that they are demarcating and provide additional meta data that helps to clarify the information. For example, the meta data on Total_Price identifies the currency for the amount, which ensures that processing will occur in U.S. dollars. Whether meta data is captured along with the data in the document or the document is the meta data itself, XML is a simple and platform-neutral data format that can easily be processed by a number of tools and products.
Meta data can and should play an enormous role in facilitating application and data sharing throughout the enterprise. By understanding what meta data lives in the corporate system and sharing that meta data across the enterprise, organizations can gain new understanding of corporate assets, both intellectual and IT, as well as drive down the costs of maintenance and increase the return on the data asset over time.
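As a minimal sketch of the idea (not a reproduction of Figure 1), assume a hypothetical order document in which the Total_Price element carries a `currency` attribute. Everything in the XML other than the Total_Price name is an assumption made for illustration. The sketch uses Python's standard `xml.etree.ElementTree` module to read the data and its meta data together.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element text is the data, attributes are meta data.
ORDER_XML = """
<Order id="1001">
  <Item sku="A-17">Widget</Item>
  <Quantity>3</Quantity>
  <Total_Price currency="USD">29.85</Total_Price>
</Order>
"""

root = ET.fromstring(ORDER_XML)

total = root.find("Total_Price")
amount = float(total.text)        # the data
currency = total.get("currency")  # the meta data that gives the amount context

print(f"Order {root.get('id')}: {amount:.2f} {currency}")
```

Without the attribute, a consuming application would have to assume the currency; with it, the document carries enough context to be processed on its own.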
JP Morgenthal is managing partner for Avorcor, an IT consultancy that focuses on integration and legacy modernization. He is also author of Enterprise Information Integration: A Pragmatic Approach. Questions or comments regarding this article can be directed to JP via e-mail at firstname.lastname@example.org. Do you have an idea for a future Enterprise Architecture column? Send it to JP; if it is used, you will win a free copy of his book.