The Roach Motel Repository
MetaThoughts
The occasional bout of insomnia inevitably means watching late-night television. As time ticks by, the advertisements get longer and are for a more eclectic range of products, seemingly designed for an audience that might not be able to assimilate more sophisticated messages. One of my favorites is the pitch for Black Flag's "Roach Motel" product, a bait trap for cockroaches that is marketed using the slogan "The roaches check in, but they never check out."
This advertisement has a strange resonance with some of the ways we tend to do things in data administration. Are we guilty of building processes, services and infrastructures that serve to trap knowledge that can never get out again?
The Data Model
In some ways, a data model has an uncanny resemblance to the roach motel. A data model is a very large batch container. It is usually thought of as a thing in itself that is managed through defined processes. Indeed, projects are often predicated on waiting for an entire data model to be finished, rather than expecting a steady stream of completed entities, attributes and relationships.
Many processes and standards reinforce the reality of the data model as a batch container. One is to begin a data modeling project with the expectation of having a formal review and sign-off process after the model is completed. At first glance, this sounds eminently reasonable and maps to how we do many other things on projects. Frankly, it is easier to think of a data model as a single thing, and it is definitely easier to manage a single thing than the diverse set of metadata concepts and instances that exist within a data model.
This approach is by no means universal, and many projects do work with intermediate deliverables from models. However, it is not uncommon, and when it happens it forces everyone to wait until the data model is "complete." This may take months, and the formal review process usually adds more time because there is a lot to digest in a data model. In the intervening period, the things that are of use within the data model are not used by those who may need them. The data model becomes a knowledge trap.
Of course, not everybody runs a data analysis project like this, but in the data world, it is fair to say that there is considerable acceptance of data modeling as a core competency. This mode of thought decouples the process of data modeling from the goal of continuously sharing knowledge about data within an organization. To put it another way, the problem arises by looking inward and valuing a data model only in data modeling terms, rather than looking outward to the rest of IT and the business beyond, and trying to pump out knowledge about the enterprise's information resources.
Hidden Secrets
Another unhelpful facet of data models is that they are typically created in tools whose licensing requirements restrict them to a very small number of individuals. Furthermore, these tools use notations that are not intuitive. This means that anything that goes into a data model is going to be inaccessible to anyone other than a data modeler.
The reality is that knowledge gleaned about data is going to be put first into a data model and nowhere else. If the data modelers are asked why they do this, the reply is often a puzzled look and a retort that this is what data models are for and there is no other approach.
The value of a logical data model is that it represents the data as the business truly sees it. Acceptance of this viewpoint means that a logical data model is intended for knowledge sharing. The analysis to get the normalization, cardinalities, optionalities, definitions, etc. requires a considerable effort, and focusing on the concepts and tools required to produce these artifacts is perfectly reasonable. What is not reasonable is to lose sight of why all this is being done. Producing the logical data model and doing nothing more than tossing it to the individuals who will make it physical is an enormous waste.
The content of logical data models must be shared, not just after they are complete, but as they are being produced. Unfortunately, data models are utterly useless for knowledge sharing except among a tiny group of specialists. If data administration is to be successful in the future, it urgently needs to tackle these issues.
The Repository
There are other approaches, and we all know about the metadata repository. In theory, data models should contribute their metadata to an enterprise-wide repository. However, herein lie more issues. Firstly, data models have the useful property of being project-level artifacts. An enterprise-wide repository is a different beast. It needs to exist at a higher organizational level and be part of a sustained program. Not only does it have to deliver value across the years, it has to deliver it widely across the enterprise.
This requirement is often simply ignored. IT staff often set up repositories from a perspective of what "best practices" or some higher form of data modeling should dictate. They rarely go out to the business and build use cases for atomic pieces of functionality to deliver (or manage) knowledge about the enterprise's information assets. This leads to a high probability that repositories will not be used. It is perhaps fear of this that leads to access to repositories being restricted to IT staff.
Malcolm Chisholm is an independent consultant focusing on meta data engineering and data management. He is the author of How to Build a Business Rules Engine and Managing Reference Data in Enterprise Databases, and he frequently writes and speaks on these topics. Chisholm runs two Web sites, http://www.bizrulesengine.com and http://www.refdataportal.com. You can contact him at mchisholm@refdataportal.com.